Dataset statistics
| Statistic | Value |
|---|---|
| Number of variables | 16 |
| Number of observations | 320772 |
| Missing cells | 1147694 |
| Missing cells (%) | 22.4% |
| Duplicate rows | 5 |
| Duplicate rows (%) | < 0.1% |
| Total size in memory | 36.4 MiB |
| Average record size in memory | 119.1 B |
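The percentages in this table follow from the raw counts. The sketch below recomputes them from the figures above; the variable names are illustrative, and the record size is only approximate because the report rounds the total size to 36.4 MiB.

```python
# Figures copied from the "Dataset statistics" table.
n_variables = 16
n_observations = 320_772
missing_cells = 1_147_694
total_size_mib = 36.4  # already rounded in the report

# Every observation has one cell per variable.
total_cells = n_variables * n_observations
missing_pct = 100 * missing_cells / total_cells

# 1 MiB = 1024 * 1024 bytes; approximate because 36.4 is a rounded value.
avg_record_bytes = total_size_mib * 1024 * 1024 / n_observations

print(f"total cells: {total_cells}")
print(f"missing cells: {missing_pct:.1f}%")
print(f"average record size: about {avg_record_bytes:.0f} B")
```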
Variable types
| Type | Count |
|---|---|
| Text | 2 |
| Categorical | 3 |
| Numeric | 11 |
Alerts
| Alert | Type |
|---|---|
| Dataset has 5 (< 0.1%) duplicate rows | Duplicates |
| countries_fr has a high cardinality: 722 distinct values | High cardinality |
| brands has a high cardinality: 58784 distinct values | High cardinality |
| countries_fr is highly imbalanced (77.4%) | Imbalance |
| product_name has 17762 (5.5%) missing values | Missing |
| brands has 28412 (8.9%) missing values | Missing |
| energy_100g has 59659 (18.6%) missing values | Missing |
| salt_100g has 65262 (20.3%) missing values | Missing |
| sodium_100g has 65309 (20.4%) missing values | Missing |
| fiber_100g has 119886 (37.4%) missing values | Missing |
| additives_n has 71833 (22.4%) missing values | Missing |
| sugars_100g has 75801 (23.6%) missing values | Missing |
| fat_100g has 76881 (24.0%) missing values | Missing |
| saturated_fat_100g has 91218 (28.4%) missing values | Missing |
| nutrition_score_uk_100g has 99562 (31.0%) missing values | Missing |
| nutrition_score_fr_100g has 99562 (31.0%) missing values | Missing |
| nutrition_grade_fr has 99562 (31.0%) missing values | Missing |
| cholesterol_100g has 176682 (55.1%) missing values | Missing |
| energy_100g is highly skewed (γ1 = 491.0039771) | Skewed |
| salt_100g is highly skewed (γ1 = 493.5037928) | Skewed |
| sodium_100g is highly skewed (γ1 = 493.458469) | Skewed |
| fiber_100g is highly skewed (γ1 = 363.5478054) | Skewed |
| cholesterol_100g is highly skewed (γ1 = 221.1178099) | Skewed |
| energy_100g has 8909 (2.8%) zeros | Zeros |
| salt_100g has 34174 (10.7%) zeros | Zeros |
| sodium_100g has 34131 (10.6%) zeros | Zeros |
| fiber_100g has 68833 (21.5%) zeros | Zeros |
| additives_n has 94259 (29.4%) zeros | Zeros |
| sugars_100g has 37077 (11.6%) zeros | Zeros |
| fat_100g has 64504 (20.1%) zeros | Zeros |
| saturated_fat_100g has 68736 (21.4%) zeros | Zeros |
| nutrition_score_uk_100g has 13588 (4.2%) zeros | Zeros |
| nutrition_score_fr_100g has 12763 (4.0%) zeros | Zeros |
| cholesterol_100g has 89441 (27.9%) zeros | Zeros |
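The γ1 values in the skewness alerts are skewness coefficients. The report does not say which estimator it uses, so the sketch below assumes the plain population form, γ1 = m3 / m2^1.5 (the profiler may apply a small-sample correction, which matters little at this sample size). The function name and sample data are illustrative only.

```python
def skewness(values):
    """Population skewness m3 / m2**1.5, ignoring missing (None) values."""
    xs = [v for v in values if v is not None]
    n = len(xs)
    mean = sum(xs) / n
    m2 = sum((x - mean) ** 2 for x in xs) / n  # second central moment
    m3 = sum((x - mean) ** 3 for x in xs) / n  # third central moment
    if m2 == 0:
        return float("nan")  # constant column: skewness is undefined
    return m3 / m2 ** 1.5

# A symmetric sample has zero skewness.
symmetric = [1.0, 2.0, 3.0, 4.0, 5.0]

# A single extreme value among many small ones gives a large positive skewness.
one_outlier = [0.0] * 99 + [1000.0, None]

print(skewness(symmetric))
print(skewness(one_outlier))
```

Values in the hundreds, as in these alerts, usually point to a few extreme entries dominating the third moment, so inspecting the column maxima is a reasonable first check.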
Reproduction
| Field | Value |
|---|---|
| Analysis started | 2024-06-07 15:44:09.372314 |
| Analysis finished | 2024-06-07 15:44:35.447859 |
| Duration | 26.08 seconds |
| Software version | ydata-profiling v0.0.dev0 |
| Download configuration | config.json |
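As a quick consistency check, the reported duration can be recomputed from the two timestamps. This is a minimal sketch using only the standard library; the variable names are illustrative.

```python
from datetime import datetime

# Timestamps copied from the "Reproduction" table.
started = datetime.fromisoformat("2024-06-07 15:44:09.372314")
finished = datetime.fromisoformat("2024-06-07 15:44:35.447859")

duration_s = (finished - started).total_seconds()
print(f"{duration_s:.2f} seconds")
```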
code
Text
| Distinct | 320749 |
|---|---|
| Distinct (%) | 100.0% |
| Missing | 23 |
| Missing (%) | < 0.1% |
| Memory size | 2.4 MiB |
Length
| Max length | 41 |
|---|---|
| Median length | 13 |
| Mean length | 12.763809 |
| Min length | 1 |
Characters and Unicode
| Total characters | 4093979 |
|---|---|
| Distinct characters | 10 |
| Distinct categories | 1 |
| Distinct scripts | 1 |
| Distinct blocks | 1 |
Unique
| Unique | 320749 |
|---|---|
| Unique (%) | 100.0% |
Sample
| 1st row | 0000000003087 |
|---|---|
| 2nd row | 0000000004530 |
| 3rd row | 0000000004559 |
| 4th row | 0000000016087 |
| 5th row | 0000000016094 |
| Value | Count | Frequency (%) |
| 0000000003087 | 1 | < 0.1% |
| 0000000016650 | 1 | < 0.1% |
| 0000000016094 | 1 | < 0.1% |
| 0000000016100 | 1 | < 0.1% |
| 0000000016117 | 1 | < 0.1% |
| 0000000016124 | 1 | < 0.1% |
| 0000000016193 | 1 | < 0.1% |
| 0000000018579 | 1 | < 0.1% |
| 0000000016513 | 1 | < 0.1% |
| 0000000016872 | 1 | < 0.1% |
| Other values (320739) | 320739 |
Most occurring characters
| Value | Count | Frequency (%) |
| 0 | 1023511 | 25.0% |
| 1 | 457173 | 11.2% |
| 3 | 391050 | 9.6% |
| 2 | 388171 | 9.5% |
| 7 | 330357 | 8.1% |
| 4 | 329051 | 8.0% |
| 5 | 315989 | 7.7% |
| 8 | 304983 | 7.4% |
| 6 | 301681 | 7.4% |
| 9 | 252013 | 6.2% |
Most occurring categories
| Value | Count | Frequency (%) |
| Decimal Number | 4093979 | 100.0% |
Most frequent character per category
Decimal Number
| Value | Count | Frequency (%) |
| 0 | 1023511 | 25.0% |
| 1 | 457173 | 11.2% |
| 3 | 391050 | 9.6% |
| 2 | 388171 | 9.5% |
| 7 | 330357 | 8.1% |
| 4 | 329051 | 8.0% |
| 5 | 315989 | 7.7% |
| 8 | 304983 | 7.4% |
| 6 | 301681 | 7.4% |
| 9 | 252013 | 6.2% |
Most occurring scripts
| Value | Count | Frequency (%) |
| Common | 4093979 | 100.0% |
Most frequent character per script
Common
| Value | Count | Frequency (%) |
| 0 | 1023511 | 25.0% |
| 1 | 457173 | 11.2% |
| 3 | 391050 | 9.6% |
| 2 | 388171 | 9.5% |
| 7 | 330357 | 8.1% |
| 4 | 329051 | 8.0% |
| 5 | 315989 | 7.7% |
| 8 | 304983 | 7.4% |
| 6 | 301681 | 7.4% |
| 9 | 252013 | 6.2% |
Most occurring blocks
| Value | Count | Frequency (%) |
| ASCII | 4093979 | 100.0% |
Most frequent character per block
ASCII
| Value | Count | Frequency (%) |
| 0 | 1023511 | 25.0% |
| 1 | 457173 | 11.2% |
| 3 | 391050 | 9.6% |
| 2 | 388171 | 9.5% |
| 7 | 330357 | 8.1% |
| 4 | 329051 | 8.0% |
| 5 | 315989 | 7.7% |
| 8 | 304983 | 7.4% |
| 6 | 301681 | 7.4% |
| 9 | 252013 | 6.2% |
countries_fr
Categorical
HIGH CARDINALITY  IMBALANCE 
| Distinct | 722 |
|---|---|
| Distinct (%) | 0.2% |
| Missing | 280 |
| Missing (%) | 0.1% |
| Memory size | 648.4 KiB |
| Value | Count |
|---|---|
| États-Unis | 172998 |
| France | 94392 |
| Suisse | 14953 |
| Allemagne | 7870 |
| Espagne | 5009 |
| Other values (717) | 25270 |
Length
| Max length | 211 |
|---|---|
| Median length | 10 |
| Mean length | 8.6033848 |
| Min length | 4 |
Characters and Unicode
| Total characters | 2757316 |
|---|---|
| Distinct characters | 121 |
| Distinct categories | 9 |
| Distinct scripts | 7 |
| Distinct blocks | 8 |
Unique
| Unique | 390 |
|---|---|
| Unique (%) | 0.1% |
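"Distinct" and "Unique" measure different things. The sketch below assumes that "Distinct" counts the different non-missing values and that "Unique" counts the values occurring exactly once. That reading is consistent with the figures here (390 unique out of 722 distinct) but is not stated in the report itself. The function name and sample are illustrative.

```python
from collections import Counter

def distinct_and_unique(values):
    """Return (number of different values, number of values seen exactly once)."""
    counts = Counter(v for v in values if v is not None)  # skip missing entries
    distinct = len(counts)
    unique = sum(1 for c in counts.values() if c == 1)
    return distinct, unique

sample = ["France", "États-Unis", "États-Unis", "France,Suisse", None, "Suisse"]
print(distinct_and_unique(sample))
```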
Sample
| 1st row | France |
|---|---|
| 2nd row | États-Unis |
| 3rd row | États-Unis |
| 4th row | États-Unis |
| 5th row | États-Unis |
Common Values
| Value | Count | Frequency (%) |
| États-Unis | 172998 | 53.9% |
| France | 94392 | 29.4% |
| Suisse | 14953 | 4.7% |
| Allemagne | 7870 | 2.5% |
| Espagne | 5009 | 1.6% |
| Royaume-Uni | 4825 | 1.5% |
| Belgique | 2595 | 0.8% |
| Australie | 2056 | 0.6% |
| Russie | 1315 | 0.4% |
| France,Suisse | 1224 | 0.4% |
| Other values (712) | 13255 | 4.1% |
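The imbalance alert for countries_fr (77.4%) summarizes how unevenly these counts are spread. The report does not define the score, so the sketch below assumes a common definition: one minus the Shannon entropy of the class frequencies divided by its maximum, log(k), for k classes. Under that assumption 0 means perfectly balanced and values near 1 mean a few classes dominate. The function name is hypothetical.

```python
import math

def imbalance_score(counts):
    """1 - H / log(k), where H is the Shannon entropy of the class frequencies."""
    total = sum(counts)
    k = len(counts)
    if k < 2:
        return 0.0  # a single class has no balance to measure
    entropy = -sum((c / total) * math.log(c / total) for c in counts if c > 0)
    return 1 - entropy / math.log(k)

print(imbalance_score([10, 10, 10, 10]))  # evenly spread classes
print(imbalance_score([97, 1, 1, 1]))     # one dominant class
```

Reproducing the 77.4% figure would need the counts of all 722 distinct values, which this table only lists in part.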
Words
| Value | Count | Frequency (%) |
| états-unis | 172999 | 53.8% |
| france | 94392 | 29.3% |
| suisse | 14953 | 4.6% |
| allemagne | 7870 | 2.4% |
| espagne | 5009 | 1.6% |
| royaume-uni | 4825 | 1.5% |
| belgique | 2595 | 0.8% |
| australie | 2056 | 0.6% |
| russie | 1315 | 0.4% |
| france,suisse | 1224 | 0.4% |
| Other values (747) | 14549 | 4.5% |
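Entries such as "France,Suisse" pack several countries into one string, which inflates the number of distinct values. A minimal sketch of counting individual countries, assuming a comma is the only separator (the data may use others); the function name and sample are illustrative.

```python
from collections import Counter

def count_countries(values, sep=","):
    """Count each country once per row, splitting multi-country strings on `sep`."""
    counts = Counter()
    for value in values:
        if value is None:  # skip missing entries
            continue
        for part in value.split(sep):
            name = part.strip()
            if name:
                counts[name] += 1
    return counts

sample = ["France", "France,Suisse", "États-Unis", None, "Suisse"]
counts = count_countries(sample)
print(counts.most_common())
```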
Most occurring characters
| Value | Count | Frequency (%) |
| s | 393767 | |
| t | 352808 | |
| a | 305218 | |
| n | 297617 | |
| i | 209391 | |
| - | 180211 | |
| U | 179165 | |
| É | 173559 | |
| e | 162965 | |
| r | 105298 | 3.8% |
| Other values (111) | 397317 |
Most occurring categories
| Value | Count | Frequency (%) |
| Lowercase Letter | 2061752 | |
| Uppercase Letter | 507351 | 18.4% |
| Dash Punctuation | 180211 | 6.5% |
| Other Punctuation | 6496 | 0.2% |
| Space Separator | 1295 | < 0.1% |
| Other Letter | 205 | < 0.1% |
| Decimal Number | 3 | < 0.1% |
| Nonspacing Mark | 2 | < 0.1% |
| Spacing Mark | 1 | < 0.1% |
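The category names in this table are Unicode general categories. The sketch below counts them with Python's standard `unicodedata` module, which returns two-letter codes (`Ll` = Lowercase Letter, `Lu` = Uppercase Letter, `Pd` = Dash Punctuation, `Po` = Other Punctuation, `Zs` = Space Separator, `Lo` = Other Letter, `Nd` = Decimal Number, `Mn` = Nonspacing Mark, `Mc` = Spacing Mark). It illustrates the classification only and is not necessarily how the profiler computes it.

```python
import unicodedata
from collections import Counter

def category_counts(strings):
    """Count characters by Unicode general category across all strings."""
    counts = Counter()
    for s in strings:
        for ch in s:
            counts[unicodedata.category(ch)] += 1
    return counts

# "\u00c9" is the precomposed capital E with acute accent, as in "États-Unis".
counts = category_counts(["\u00c9tats-Unis", "France,Suisse"])
print(counts)
```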
Most frequent character per category
Other Letter
| Value | Count | Frequency (%) |
| ا | 23 | 11.2% |
| ل | 20 | 9.8% |
| ة | 13 | 6.3% |
| ع | 12 | 5.9% |
| ي | 10 | 4.9% |
| س | 10 | 4.9% |
| ن | 10 | 4.9% |
| د | 8 | 3.9% |
| م | 8 | 3.9% |
| 日 | 7 | 3.4% |
| Other values (33) | 84 |
Lowercase Letter
| Value | Count | Frequency (%) |
| s | 393767 | |
| t | 352808 | |
| a | 305218 | |
| n | 297617 | |
| i | 209391 | |
| e | 162965 | |
| r | 105298 | 5.1% |
| c | 99674 | 4.8% |
| u | 34763 | 1.7% |
| l | 28158 | 1.4% |
| Other values (30) | 72093 | 3.5% |
Uppercase Letter
| Value | Count | Frequency (%) |
| U | 179165 | |
| É | 173559 | |
| F | 98516 | |
| S | 17524 | 3.5% |
| A | 11625 | 2.3% |
| R | 7798 | 1.5% |
| E | 5403 | 1.1% |
| B | 4608 | 0.9% |
| P | 1870 | 0.4% |
| I | 1813 | 0.4% |
| Other values (18) | 5470 | 1.1% |
Other Punctuation
| Value | Count | Frequency (%) |
| , | 6274 | |
| : | 194 | 3.0% |
| ' | 28 | 0.4% |
Decimal Number
| Value | Count | Frequency (%) |
| 7 | 2 | |
| 6 | 1 |
Nonspacing Mark
| Value | Count | Frequency (%) |
| ั | 1 | |
| ี | 1 |
Dash Punctuation
| Value | Count | Frequency (%) |
| - | 180211 |
Space Separator
| Value | Count | Frequency (%) |
| (space) | 1295 |
Spacing Mark
| Value | Count | Frequency (%) |
| ा | 1 |
Most occurring scripts
| Value | Count | Frequency (%) |
| Latin | 2569094 | |
| Common | 188005 | 6.8% |
| Arabic | 148 | < 0.1% |
| Thai | 38 | < 0.1% |
| Han | 18 | < 0.1% |
| Cyrillic | 9 | < 0.1% |
| Devanagari | 4 | < 0.1% |
Most frequent character per script
Latin
| Value | Count | Frequency (%) |
| s | 393767 | |
| t | 352808 | |
| a | 305218 | |
| n | 297617 | |
| i | 209391 | |
| U | 179165 | |
| É | 173559 | |
| e | 162965 | |
| r | 105298 | 4.1% |
| c | 99674 | 3.9% |
| Other values (51) | 289632 |
Thai
| Value | Count | Frequency (%) |
| ร | 5 | |
| เ | 4 | 10.5% |
| า | 3 | 7.9% |
| อ | 3 | 7.9% |
| ท | 3 | 7.9% |
| ย | 2 | 5.3% |
| ส | 2 | 5.3% |
| ศ | 2 | 5.3% |
| ป | 2 | 5.3% |
| ะ | 2 | 5.3% |
| Other values (10) | 10 |
Arabic
| Value | Count | Frequency (%) |
| ا | 23 | |
| ل | 20 | |
| ة | 13 | |
| ع | 12 | |
| ي | 10 | 6.8% |
| س | 10 | 6.8% |
| ن | 10 | 6.8% |
| د | 8 | 5.4% |
| م | 8 | 5.4% |
| و | 7 | 4.7% |
| Other values (8) | 27 |
Common
| Value | Count | Frequency (%) |
| - | 180211 | |
| , | 6274 | 3.3% |
| (space) | 1295 | 0.7% |
| : | 194 | 0.1% |
| ' | 28 | < 0.1% |
| 7 | 2 | < 0.1% |
| 6 | 1 | < 0.1% |
Cyrillic
| Value | Count | Frequency (%) |
| а | 3 | |
| з | 1 | 11.1% |
| н | 1 | 11.1% |
| т | 1 | 11.1% |
| с | 1 | 11.1% |
| х | 1 | 11.1% |
| К | 1 | 11.1% |
Han
| Value | Count | Frequency (%) |
| 日 | 7 | |
| 本 | 7 | |
| 港 | 2 | 11.1% |
| 香 | 2 | 11.1% |
Devanagari
| Value | Count | Frequency (%) |
| त | 1 | |
| र | 1 | |
| ा | 1 | |
| भ | 1 |
Most occurring blocks
| Value | Count | Frequency (%) |
| ASCII | 2580776 | |
| None | 176314 | 6.4% |
| Arabic | 148 | < 0.1% |
| Thai | 38 | < 0.1% |
| CJK | 18 | < 0.1% |
| IPA Ext | 9 | < 0.1% |
| Cyrillic | 9 | < 0.1% |
| Devanagari | 4 | < 0.1% |
Most frequent character per block
ASCII
| Value | Count | Frequency (%) |
| s | 393767 | |
| t | 352808 | |
| a | 305218 | |
| n | 297617 | |
| i | 209391 | |
| - | 180211 | |
| U | 179165 | |
| e | 162965 | |
| r | 105298 | 4.1% |
| c | 99674 | 3.9% |
| Other values (48) | 294662 |
None
| Value | Count | Frequency (%) |
| É | 173559 | |
| é | 1674 | 0.9% |
| è | 638 | 0.4% |
| ï | 286 | 0.2% |
| ç | 63 | < 0.1% |
| ë | 43 | < 0.1% |
| ô | 28 | < 0.1% |
| ê | 17 | < 0.1% |
| Î | 6 | < 0.1% |
Arabic
| Value | Count | Frequency (%) |
| ا | 23 | |
| ل | 20 | |
| ة | 13 | |
| ع | 12 | |
| ي | 10 | 6.8% |
| س | 10 | 6.8% |
| ن | 10 | 6.8% |
| د | 8 | 5.4% |
| م | 8 | 5.4% |
| و | 7 | 4.7% |
| Other values (8) | 27 |
IPA Ext
| Value | Count | Frequency (%) |
| ə | 9 |
CJK
| Value | Count | Frequency (%) |
| 日 | 7 | |
| 本 | 7 | |
| 港 | 2 | 11.1% |
| 香 | 2 | 11.1% |
Thai
| Value | Count | Frequency (%) |
| ร | 5 | |
| เ | 4 | 10.5% |
| า | 3 | 7.9% |
| อ | 3 | 7.9% |
| ท | 3 | 7.9% |
| ย | 2 | 5.3% |
| ส | 2 | 5.3% |
| ศ | 2 | 5.3% |
| ป | 2 | 5.3% |
| ะ | 2 | 5.3% |
| Other values (10) | 10 |
Cyrillic
| Value | Count | Frequency (%) |
| а | 3 | |
| з | 1 | 11.1% |
| н | 1 | 11.1% |
| т | 1 | 11.1% |
| с | 1 | 11.1% |
| х | 1 | 11.1% |
| К | 1 | 11.1% |
Devanagari
| Value | Count | Frequency (%) |
| त | 1 | |
| र | 1 | |
| ा | 1 | |
| भ | 1 |
product_name
Text
MISSING 
| Distinct | 221347 |
|---|---|
| Distinct (%) | 73.0% |
| Missing | 17762 |
| Missing (%) | 5.5% |
| Memory size | 2.4 MiB |
Length
| Max length | 234 |
|---|---|
| Median length | 163 |
| Mean length | 25.935392 |
| Min length | 1 |
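The length statistics can be recomputed directly from the raw column. This is a minimal sketch using the standard library; it measures length in Unicode code points and skips missing values, both of which are assumptions about how the report counts. The function name and sample are illustrative.

```python
import statistics

def length_summary(values):
    """Min, median, mean and max string length, ignoring missing values."""
    lengths = [len(v) for v in values if v is not None]
    return {
        "min": min(lengths),
        "median": statistics.median(lengths),
        "mean": statistics.mean(lengths),
        "max": max(lengths),
    }

# "\u00e9" is the precomposed e with acute accent, as in "blé".
sample = ["Peanuts", "Organic Polenta", None, "Farine de bl\u00e9 noir"]
summary = length_summary(sample)
print(summary)
```

Recomputing the median this way is a cheap cross-check when a reported median (163 here) sits far from the mean (25.9).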
Characters and Unicode
| Total characters | 7858683 |
|---|---|
| Distinct characters | 1170 |
| Distinct categories | 21 |
| Distinct scripts | 12 |
| Distinct blocks | 21 |
Unique
| Unique | 196915 |
|---|---|
| Unique (%) | 65.0% |
Sample
| 1st row | Farine de blé noir |
|---|---|
| 2nd row | Banana Chips Sweetened (Whole) |
| 3rd row | Peanuts |
| 4th row | Organic Salted Nut Mix |
| 5th row | Organic Polenta |
| Value | Count | Frequency (%) |
| de | 27212 | 2.3% |
|  | 26484 | 2.2% |
| chocolate | 11502 | 1.0% |
| sauce | 10673 | 0.9% |
| cheese | 10419 | 0.9% |
| organic | 9469 | 0.8% |
| with | 8416 | 0.7% |
| au | 8117 | 0.7% |
| mix | 7317 | 0.6% |
| à | 6356 | 0.5% |
| Other values (56210) | 1079210 |
Most occurring characters
| Value | Count | Frequency (%) |
| (space) | 909456 | 11.6% |
| e | 784076 | 10.0% |
| a | 596106 | 7.6% |
| r | 459723 | 5.8% |
| i | 451258 | 5.7% |
| o | 406022 | 5.2% |
| t | 365920 | 4.7% |
| n | 354996 | 4.5% |
| s | 337707 | 4.3% |
| l | 323388 | 4.1% |
| Other values (1160) | 2870031 |
Most occurring categories
| Value | Count | Frequency (%) |
| Lowercase Letter | 5721495 | |
| Uppercase Letter | 1009124 | 12.8% |
| Space Separator | 909509 | 11.6% |
| Other Punctuation | 142278 | 1.8% |
| Decimal Number | 44167 | 0.6% |
| Dash Punctuation | 16349 | 0.2% |
| Open Punctuation | 5273 | 0.1% |
| Close Punctuation | 5272 | 0.1% |
| Other Letter | 2965 | < 0.1% |
| Math Symbol | 1241 | < 0.1% |
| Other values (11) | 1010 | < 0.1% |
Most frequent character per category
Other Letter
| Value | Count | Frequency (%) |
| ا | 53 | 1.8% |
| ي | 43 | 1.5% |
| ل | 40 | 1.3% |
| ו | 40 | 1.3% |
| 克 | 37 | 1.2% |
| เ | 34 | 1.1% |
| 味 | 32 | 1.1% |
| و | 31 | 1.0% |
| ร | 31 | 1.0% |
| า | 30 | 1.0% |
| Other values (768) | 2594 |
Lowercase Letter
| Value | Count | Frequency (%) |
| e | 784076 | |
| a | 596106 | |
| r | 459723 | 8.0% |
| i | 451258 | 7.9% |
| o | 406022 | 7.1% |
| t | 365920 | 6.4% |
| n | 354996 | 6.2% |
| s | 337707 | 5.9% |
| l | 323388 | 5.7% |
| u | 248650 | 4.3% |
| Other values (154) | 1393649 |
Uppercase Letter
| Value | Count | Frequency (%) |
| C | 154323 | |
| S | 121001 | |
| P | 86294 | 8.6% |
| B | 81287 | 8.1% |
| M | 64630 | 6.4% |
| F | 49108 | 4.9% |
| T | 42996 | 4.3% |
| G | 41986 | 4.2% |
| A | 38382 | 3.8% |
| O | 38335 | 3.8% |
| Other values (102) | 290782 |
Other Punctuation
| Value | Count | Frequency (%) |
| , | 86486 | |
| & | 21619 | 15.2% |
| ' | 16194 | 11.4% |
| % | 7926 | 5.6% |
| . | 3741 | 2.6% |
| ; | 2711 | 1.9% |
| ! | 1289 | 0.9% |
| / | 864 | 0.6% |
| : | 741 | 0.5% |
| * | 225 | 0.2% |
| Other values (17) | 482 | 0.3% |
Nonspacing Mark
| Value | Count | Frequency (%) |
| ี | 30 | |
| ้ | 20 | |
| ์ | 18 | |
| ่ | 16 | |
| ิ | 14 | |
| ́ | 8 | 5.9% |
| ุ | 7 | 5.1% |
| ั | 5 | 3.7% |
| ็ | 4 | 2.9% |
| ื | 4 | 2.9% |
| Other values (6) | 10 | 7.4% |
Other Symbol
| Value | Count | Frequency (%) |
| ® | 131 | |
| ° | 120 | |
| № | 9 | 3.1% |
| ♥ | 8 | 2.7% |
| ★ | 8 | 2.7% |
| ™ | 5 | 1.7% |
| 💧 | 3 | 1.0% |
| ℅ | 3 | 1.0% |
| � | 2 | 0.7% |
| © | 2 | 0.7% |
| Other values (4) | 4 | 1.4% |
Decimal Number
| Value | Count | Frequency (%) |
| 0 | 13211 | |
| 1 | 8496 | |
| 2 | 6505 | |
| 5 | 3672 | 8.3% |
| 4 | 3171 | 7.2% |
| 3 | 3132 | 7.1% |
| 6 | 2079 | 4.7% |
| 8 | 1780 | 4.0% |
| 7 | 1338 | 3.0% |
| 9 | 783 | 1.8% |
Math Symbol
| Value | Count | Frequency (%) |
| + | 1212 | |
| \| | 12 | 1.0% |
| = | 6 | 0.5% |
| ~ | 3 | 0.2% |
| > | 2 | 0.2% |
| < | 2 | 0.2% |
| × | 2 | 0.2% |
| ≤ | 1 | 0.1% |
| ~ | 1 | 0.1% |
Open Punctuation
| Value | Count | Frequency (%) |
| ( | 5219 | |
| [ | 42 | 0.8% |
| { | 7 | 0.1% |
| „ | 3 | 0.1% |
| ‚ | 1 | < 0.1% |
| ( | 1 | < 0.1% |
Dash Punctuation
| Value | Count | Frequency (%) |
| - | 16331 | |
| – | 11 | 0.1% |
| — | 6 | < 0.1% |
| 〜 | 1 | < 0.1% |
Close Punctuation
| Value | Count | Frequency (%) |
| ) | 5224 | |
| ] | 41 | 0.8% |
| } | 6 | 0.1% |
| ) | 1 | < 0.1% |
Modifier Letter
| Value | Count | Frequency (%) |
| ー | 34 | |
| ー | 4 | 9.3% |
| ゙ | 4 | 9.3% |
| ゚ | 1 | 2.3% |
Control
| Value | Count | Frequency (%) |
| | 9 | |
| | 6 | |
| | 2 | 10.5% |
| | 2 | 10.5% |
Space Separator
| Value | Count | Frequency (%) |
| (space) | 909456 | |
| (non-ASCII space) | 50 | < 0.1% |
| (non-ASCII space) | 3 | < 0.1% |
Initial Punctuation
| Value | Count | Frequency (%) |
| « | 177 | |
| “ | 11 | 5.8% |
| ‘ | 1 | 0.5% |
Final Punctuation
| Value | Count | Frequency (%) |
| » | 175 | |
| ’ | 15 | 7.6% |
| ” | 7 | 3.6% |
Currency Symbol
| Value | Count | Frequency (%) |
| $ | 49 | |
| € | 14 | 21.9% |
| ¢ | 1 | 1.6% |
Modifier Symbol
| Value | Count | Frequency (%) |
| ` | 24 | |
| ´ | 11 | |
| ¨ | 1 | 2.8% |
Connector Punctuation
| Value | Count | Frequency (%) |
| _ | 29 |
Format
| Value | Count | Frequency (%) |
| | 1 |
Other Number
| Value | Count | Frequency (%) |
| ² | 1 |
Most occurring scripts
| Value | Count | Frequency (%) |
| Latin | 6702167 | |
| Common | 1124962 | 14.3% |
| Cyrillic | 28021 | 0.4% |
| Han | 1065 | < 0.1% |
| Thai | 528 | < 0.1% |
| Greek | 443 | < 0.1% |
| Arabic | 431 | < 0.1% |
| Hebrew | 290 | < 0.1% |
| Katakana | 285 | < 0.1% |
| Hiragana | 246 | < 0.1% |
| Other values (2) | 245 | < 0.1% |
Most frequent character per script
Han
| Value | Count | Frequency (%) |
| 克 | 37 | 3.5% |
| 味 | 32 | 3.0% |
| 糖 | 26 | 2.4% |
| 奶 | 23 | 2.2% |
| 牛 | 21 | 2.0% |
| 巧 | 19 | 1.8% |
| 棒 | 18 | 1.7% |
| 力 | 17 | 1.6% |
| 茶 | 16 | 1.5% |
| 果 | 15 | 1.4% |
| Other values (427) | 841 |
Latin
| Value | Count | Frequency (%) |
| e | 784076 | 11.7% |
| a | 596106 | 8.9% |
| r | 459723 | 6.9% |
| i | 451258 | 6.7% |
| o | 406022 | 6.1% |
| t | 365920 | 5.5% |
| n | 354996 | 5.3% |
| s | 337707 | 5.0% |
| l | 323388 | 4.8% |
| u | 248650 | 3.7% |
| Other values (154) | 2374321 |
Hangul
| Value | Count | Frequency (%) |
| 스 | 6 | 2.6% |
| 고 | 6 | 2.6% |
| 장 | 5 | 2.2% |
| 초 | 4 | 1.7% |
| 오 | 4 | 1.7% |
| 마 | 4 | 1.7% |
| 즈 | 4 | 1.7% |
| 쌀 | 4 | 1.7% |
| 다 | 4 | 1.7% |
| 시 | 4 | 1.7% |
| Other values (125) | 185 |
Common
| Value | Count | Frequency (%) |
| (space) | 909456 | 80.8% |
| , | 86486 | 7.7% |
| & | 21619 | 1.9% |
| - | 16331 | 1.5% |
| ' | 16194 | 1.4% |
| 0 | 13211 | 1.2% |
| 1 | 8496 | 0.8% |
| % | 7926 | 0.7% |
| 2 | 6505 | 0.6% |
| ) | 5224 | 0.5% |
| Other values (89) | 33514 | 3.0% |
Cyrillic
| Value | Count | Frequency (%) |
| о | 2981 | 10.6% |
| а | 2603 | 9.3% |
| н | 1974 | 7.0% |
| е | 1964 | 7.0% |
| и | 1708 | 6.1% |
| р | 1583 | 5.6% |
| с | 1523 | 5.4% |
| к | 1410 | 5.0% |
| л | 1316 | 4.7% |
| т | 999 | 3.6% |
| Other values (55) | 9960 |
Katakana
| Value | Count | Frequency (%) |
| ン | 23 | 8.1% |
| チ | 18 | 6.3% |
| ル | 17 | 6.0% |
| ラ | 15 | 5.3% |
| ク | 13 | 4.6% |
| ス | 13 | 4.6% |
| ト | 10 | 3.5% |
| タ | 9 | 3.2% |
| コ | 9 | 3.2% |
| キ | 8 | 2.8% |
| Other values (52) | 150 |
Hiragana
| Value | Count | Frequency (%) |
| ん | 25 | 10.2% |
| の | 15 | 6.1% |
| り | 15 | 6.1% |
| う | 14 | 5.7% |
| き | 10 | 4.1% |
| い | 10 | 4.1% |
| ご | 9 | 3.7% |
| し | 8 | 3.3% |
| ち | 7 | 2.8% |
| ま | 7 | 2.8% |
| Other values (45) | 126 |
Greek
| Value | Count | Frequency (%) |
| α | 43 | 9.7% |
| ο | 36 | 8.1% |
| ι | 29 | 6.5% |
| ρ | 27 | 6.1% |
| τ | 21 | 4.7% |
| λ | 19 | 4.3% |
| κ | 16 | 3.6% |
| ς | 16 | 3.6% |
| υ | 16 | 3.6% |
| ν | 15 | 3.4% |
| Other values (38) | 205 |
Thai
| Value | Count | Frequency (%) |
| เ | 34 | 6.4% |
| ร | 31 | 5.9% |
| า | 30 | 5.7% |
| ี | 30 | 5.7% |
| ส | 23 | 4.4% |
| ง | 23 | 4.4% |
| ย | 22 | 4.2% |
| น | 21 | 4.0% |
| ้ | 20 | 3.8% |
| ก | 18 | 3.4% |
| Other values (33) | 276 |
Arabic
| Value | Count | Frequency (%) |
| ا | 53 | 12.3% |
| ي | 43 | 10.0% |
| ل | 40 | 9.3% |
| و | 31 | 7.2% |
| م | 26 | 6.0% |
| ن | 25 | 5.8% |
| ر | 20 | 4.6% |
| ب | 20 | 4.6% |
| ة | 16 | 3.7% |
| ك | 15 | 3.5% |
| Other values (20) | 142 |
Hebrew
| Value | Count | Frequency (%) |
| ו | 40 | |
| י | 27 | 9.3% |
| מ | 24 | 8.3% |
| ל | 23 | 7.9% |
| פ | 20 | 6.9% |
| ר | 19 | 6.6% |
| ק | 19 | 6.6% |
| ת | 14 | 4.8% |
| ח | 12 | 4.1% |
| ט | 10 | 3.4% |
| Other values (17) | 82 |
Inherited
| Value | Count | Frequency (%) |
| ́ | 8 | |
| ︎ | 3 | 20.0% |
| ̀ | 2 | 13.3% |
| ̈ | 1 | 6.7% |
| ️ | 1 | 6.7% |
Most occurring blocks
| Value | Count | Frequency (%) |
| ASCII | 7757321 | |
| None | 70083 | 0.9% |
| Cyrillic | 28021 | 0.4% |
| CJK | 1065 | < 0.1% |
| Thai | 528 | < 0.1% |
| Arabic | 431 | < 0.1% |
| Katakana | 311 | < 0.1% |
| Hebrew | 290 | < 0.1% |
| Hiragana | 246 | < 0.1% |
| Hangul | 230 | < 0.1% |
| Other values (11) | 157 | < 0.1% |
Most frequent character per block
ASCII
| Value | Count | Frequency (%) |
| (space) | 909456 | 11.7% |
| e | 784076 | 10.1% |
| a | 596106 | 7.7% |
| r | 459723 | 5.9% |
| i | 451258 | 5.8% |
| o | 406022 | 5.2% |
| t | 365920 | 4.7% |
| n | 354996 | 4.6% |
| s | 337707 | 4.4% |
| l | 323388 | 4.2% |
| Other values (84) | 2768669 |
None
| Value | Count | Frequency (%) |
| é | 37970 | |
| à | 6325 | 9.0% |
| è | 5876 | 8.4% |
| â | 3186 | 4.5% |
| ê | 1852 | 2.6% |
| ü | 1672 | 2.4% |
| û | 1506 | 2.1% |
| ä | 1185 | 1.7% |
| ô | 1080 | 1.5% |
| É | 937 | 1.3% |
| Other values (179) | 8494 | 12.1% |
Cyrillic
| Value | Count | Frequency (%) |
| о | 2981 | 10.6% |
| а | 2603 | 9.3% |
| н | 1974 | 7.0% |
| е | 1964 | 7.0% |
| и | 1708 | 6.1% |
| р | 1583 | 5.6% |
| с | 1523 | 5.4% |
| к | 1410 | 5.0% |
| л | 1316 | 4.7% |
| т | 999 | 3.6% |
| Other values (55) | 9960 |
Arabic
| Value | Count | Frequency (%) |
| ا | 53 | 12.3% |
| ي | 43 | 10.0% |
| ل | 40 | 9.3% |
| و | 31 | 7.2% |
| م | 26 | 6.0% |
| ن | 25 | 5.8% |
| ر | 20 | 4.6% |
| ب | 20 | 4.6% |
| ة | 16 | 3.7% |
| ك | 15 | 3.5% |
| Other values (20) | 142 |
Hebrew
| Value | Count | Frequency (%) |
| ו | 40 | |
| י | 27 | 9.3% |
| מ | 24 | 8.3% |
| ל | 23 | 7.9% |
| פ | 20 | 6.9% |
| ר | 19 | 6.6% |
| ק | 19 | 6.6% |
| ת | 14 | 4.8% |
| ח | 12 | 4.1% |
| ט | 10 | 3.4% |
| Other values (17) | 82 |
CJK
| Value | Count | Frequency (%) |
| 克 | 37 | 3.5% |
| 味 | 32 | 3.0% |
| 糖 | 26 | 2.4% |
| 奶 | 23 | 2.2% |
| 牛 | 21 | 2.0% |
| 巧 | 19 | 1.8% |
| 棒 | 18 | 1.7% |
| 力 | 17 | 1.6% |
| 茶 | 16 | 1.5% |
| 果 | 15 | 1.4% |
| Other values (427) | 841 |
Thai
| Value | Count | Frequency (%) |
| เ | 34 | 6.4% |
| ร | 31 | 5.9% |
| า | 30 | 5.7% |
| ี | 30 | 5.7% |
| ส | 23 | 4.4% |
| ง | 23 | 4.4% |
| ย | 22 | 4.2% |
| น | 21 | 4.0% |
| ้ | 20 | 3.8% |
| ก | 18 | 3.4% |
| Other values (33) | 276 |
Katakana
| Value | Count | Frequency (%) |
| ー | 34 | 10.9% |
| ン | 23 | 7.4% |
| チ | 18 | 5.8% |
| ル | 17 | 5.5% |
| ラ | 15 | 4.8% |
| ク | 13 | 4.2% |
| ス | 13 | 4.2% |
| ト | 10 | 3.2% |
| タ | 9 | 2.9% |
| コ | 9 | 2.9% |
| Other values (49) | 150 |
Hiragana
| Value | Count | Frequency (%) |
| ん | 25 | 10.2% |
| の | 15 | 6.1% |
| り | 15 | 6.1% |
| う | 14 | 5.7% |
| き | 10 | 4.1% |
| い | 10 | 4.1% |
| ご | 9 | 3.7% |
| し | 8 | 3.3% |
| ち | 7 | 2.8% |
| ま | 7 | 2.8% |
| Other values (45) | 126 |
Punctuation
| Value | Count | Frequency (%) |
| ’ | 15 | |
| • | 13 | |
| – | 11 | |
| … | 11 | |
| “ | 11 | |
| ” | 7 | |
| — | 6 | 7.4% |
| „ | 3 | 3.7% |
| ‘ | 1 | 1.2% |
| | 1 | 1.2% |
| Other values (2) | 2 | 2.5% |
Currency Symbols
| Value | Count | Frequency (%) |
| € | 14 |
Letterlike Symbols
| Value | Count | Frequency (%) |
| № | 9 | |
| ™ | 5 | |
| ℅ | 3 | 17.6% |
Misc Symbols
| Value | Count | Frequency (%) |
| ♥ | 8 | |
| ★ | 8 |
Diacriticals
| Value | Count | Frequency (%) |
| ́ | 8 | |
| ̀ | 2 | 18.2% |
| ̈ | 1 | 9.1% |
Hangul
| Value | Count | Frequency (%) |
| 스 | 6 | 2.6% |
| 고 | 6 | 2.6% |
| 장 | 5 | 2.2% |
| 초 | 4 | 1.7% |
| 오 | 4 | 1.7% |
| 마 | 4 | 1.7% |
| 즈 | 4 | 1.7% |
| 쌀 | 4 | 1.7% |
| 다 | 4 | 1.7% |
| 시 | 4 | 1.7% |
| Other values (125) | 185 |
VS
| Value | Count | Frequency (%) |
| ︎ | 3 | |
| ️ | 1 | 25.0% |
Specials
| Value | Count | Frequency (%) |
| � | 2 |
Latin Ext Additional
| Value | Count | Frequency (%) |
| ọ | 1 | |
| ế | 1 | |
| ắ | 1 | |
| ớ | 1 | |
| ở | 1 | |
| ạ | 1 | |
| ề | 1 | |
| ấ | 1 | |
| ị | 1 |
Enclosed Alphanum Sup
| Value | Count | Frequency (%) |
| 🅫 | 1 |
Dingbats
| Value | Count | Frequency (%) |
| ❤ | 1 |
Math Operators
| Value | Count | Frequency (%) |
| ≤ | 1 |
brands
Categorical
HIGH CARDINALITY  MISSING 
| Distinct | 58784 |
|---|---|
| Distinct (%) | 20.1% |
| Missing | 28412 |
| Missing (%) | 8.9% |
| Memory size | 3.7 MiB |
| Value | Count |
|---|---|
| Carrefour | 2978 |
| Auchan | 2340 |
| U | 2050 |
| Meijer | 1995 |
| Leader Price | 1700 |
| Other values (58779) | 281297 |
Length
| Max length | 228 |
|---|---|
| Median length | 155 |
| Mean length | 15.327531 |
| Min length | 1 |
Characters and Unicode
| Total characters | 4481157 |
|---|---|
| Distinct characters | 633 |
| Distinct categories | 18 |
| Distinct scripts | 12 |
| Distinct blocks | 14 |
Unique
| Unique | 34837 |
|---|---|
| Unique (%) | 11.9% |
Sample
| 1st row | Ferme t'y R'nao |
|---|---|
| 2nd row | Torn & Glasser |
| 3rd row | Grizzlies |
| 4th row | Bob's Red Mill |
| 5th row | Unfi |
Common Values
| Value | Count | Frequency (%) |
| Carrefour | 2978 | 0.9% |
| Auchan | 2340 | 0.7% |
| U | 2050 | 0.6% |
| Meijer | 1995 | 0.6% |
| Leader Price | 1700 | 0.5% |
| Kroger | 1660 | 0.5% |
| Casino | 1608 | 0.5% |
| Ahold | 1370 | 0.4% |
| Spartan | 1341 | 0.4% |
| Roundy's | 1299 | 0.4% |
| Other values (58774) | 274019 | 85.4% |
| (Missing) | 28412 | 8.9% |
Words
| Value | Count | Frequency (%) |
| inc | 35213 | 5.3% |
| foods | 14341 | 2.1% |
| llc | 8873 | 1.3% |
| company | 8786 | 1.3% |
|  | 8544 | 1.3% |
| co | 7706 | 1.2% |
| food | 7136 | 1.1% |
| the | 5004 | 0.7% |
| carrefour | 4519 | 0.7% |
| market | 4124 | 0.6% |
| Other values (39713) | 565196 |
Most occurring characters
| Value | Count | Frequency (%) |
| (space) | 442357 | 9.9% |
| e | 380059 | 8.5% |
| a | 327513 | 7.3% |
| r | 296410 | 6.6% |
| o | 286876 | 6.4% |
| n | 247897 | 5.5% |
| i | 230699 | 5.1% |
| s | 199096 | 4.4% |
| t | 171925 | 3.8% |
| l | 166222 | 3.7% |
| Other values (623) | 1732103 |
Most occurring categories
| Value | Count | Frequency (%) |
| Lowercase Letter | 3155292 | |
| Uppercase Letter | 689398 | 15.4% |
| Space Separator | 442359 | 9.9% |
| Other Punctuation | 172153 | 3.8% |
| Dash Punctuation | 10725 | 0.2% |
| Decimal Number | 7074 | 0.2% |
| Close Punctuation | 1282 | < 0.1% |
| Open Punctuation | 1282 | < 0.1% |
| Other Letter | 982 | < 0.1% |
| Math Symbol | 298 | < 0.1% |
| Other values (8) | 312 | < 0.1% |
Most frequent character per category
Other Letter
| Value | Count | Frequency (%) |
| º | 46 | 4.7% |
| า | 42 | 4.3% |
| ا | 41 | 4.2% |
| ม | 37 | 3.8% |
| ل | 23 | 2.3% |
| ร | 23 | 2.3% |
| و | 22 | 2.2% |
| ต | 19 | 1.9% |
| ن | 19 | 1.9% |
| แ | 18 | 1.8% |
| Other values (314) | 692 |
Lowercase Letter
| Value | Count | Frequency (%) |
| e | 380059 | |
| a | 327513 | |
| r | 296410 | |
| o | 286876 | |
| n | 247897 | 7.9% |
| i | 230699 | 7.3% |
| s | 199096 | 6.3% |
| t | 171925 | 5.4% |
| l | 166222 | 5.3% |
| c | 133205 | 4.2% |
| Other values (133) | 715390 |
Uppercase Letter
| Value | Count | Frequency (%) |
| C | 74077 | 10.7% |
| S | 59861 | 8.7% |
| F | 49978 | 7.2% |
| M | 49416 | 7.2% |
| I | 47411 | 6.9% |
| B | 44253 | 6.4% |
| L | 42357 | 6.1% |
| P | 35235 | 5.1% |
| A | 31592 | 4.6% |
| T | 29764 | 4.3% |
| Other values (93) | 225454 |
Other Punctuation
| Value | Count | Frequency (%) |
| . | 57999 | |
| , | 57382 | |
| ' | 29445 | |
| / | 13588 | 7.9% |
| & | 8904 | 5.2% |
| : | 3333 | 1.9% |
| ! | 1183 | 0.7% |
| " | 150 | 0.1% |
| % | 34 | < 0.1% |
| ; | 31 | < 0.1% |
| Other values (9) | 104 | 0.1% |
Decimal Number
| Value | Count | Frequency (%) |
| 3 | 1340 | |
| 5 | 1239 | |
| 6 | 1108 | |
| 1 | 675 | |
| 2 | 656 | |
| 0 | 626 | |
| 7 | 510 | 7.2% |
| 4 | 385 | 5.4% |
| 9 | 285 | 4.0% |
| 8 | 250 | 3.5% |
Nonspacing Mark
| Value | Count | Frequency (%) |
| ่ | 23 | |
| ้ | 12 | |
| ิ | 12 | |
| ์ | 8 | 11.9% |
| ี | 6 | 9.0% |
| ั | 4 | 6.0% |
| ́ | 1 | 1.5% |
| ู | 1 | 1.5% |
Space Separator
| Value | Count | Frequency (%) |
| (space) | 442357 | |
| (non-ASCII space) | 1 | < 0.1% |
| (non-ASCII space) | 1 | < 0.1% |
Close Punctuation
| Value | Count | Frequency (%) |
| ) | 1268 | |
| ] | 12 | 0.9% |
| ) | 2 | 0.2% |
Open Punctuation
| Value | Count | Frequency (%) |
| ( | 1267 | |
| [ | 11 | 0.9% |
| ( | 4 | 0.3% |
Math Symbol
| Value | Count | Frequency (%) |
| + | 295 | |
| \| | 2 | 0.7% |
| ~ | 1 | 0.3% |
Other Symbol
| Value | Count | Frequency (%) |
| ® | 20 | |
| ° | 6 | 19.4% |
| № | 5 | 16.1% |
Dash Punctuation
| Value | Count | Frequency (%) |
| - | 10718 | |
| — | 7 | 0.1% |
Currency Symbol
| Value | Count | Frequency (%) |
| $ | 71 | |
| € | 11 | 13.4% |
Final Punctuation
| Value | Count | Frequency (%) |
| » | 45 | |
| ’ | 9 | 16.7% |
Modifier Symbol
| Value | Count | Frequency (%) |
| ´ | 12 | |
| ` | 10 |
Initial Punctuation
| Value | Count | Frequency (%) |
| « | 47 |
Other Number
| Value | Count | Frequency (%) |
| ³ | 6 |
Modifier Letter
| Value | Count | Frequency (%) |
| ー | 3 |
Most occurring scripts
| Value | Count | Frequency (%) |
| Latin | 3833890 | |
| Common | 635418 | 14.2% |
| Cyrillic | 10645 | 0.2% |
| Thai | 330 | < 0.1% |
| Han | 245 | < 0.1% |
| Arabic | 219 | < 0.1% |
| Greek | 201 | < 0.1% |
| Hebrew | 73 | < 0.1% |
| Katakana | 59 | < 0.1% |
| Hangul | 53 | < 0.1% |
| Other values (2) | 24 | < 0.1% |
Most frequent character per script
Han
| Value | Count | Frequency (%) |
| 旺 | 6 | 2.4% |
| 可 | 5 | 2.0% |
| 乐 | 5 | 2.0% |
| 品 | 5 | 2.0% |
| 雪 | 4 | 1.6% |
| 好 | 4 | 1.6% |
| 生 | 3 | 1.2% |
| 光 | 3 | 1.2% |
| 治 | 3 | 1.2% |
| 明 | 3 | 1.2% |
| Other values (146) | 204 |
Latin
| Value | Count | Frequency (%) |
| e | 380059 | 9.9% |
| a | 327513 | 8.5% |
| r | 296410 | 7.7% |
| o | 286876 | 7.5% |
| n | 247897 | 6.5% |
| i | 230699 | 6.0% |
| s | 199096 | 5.2% |
| t | 171925 | 4.5% |
| l | 166222 | 4.3% |
| c | 133205 | 3.5% |
| Other values (124) | 1393988 |
Cyrillic
| Value | Count | Frequency (%) |
| о | 946 | 8.9% |
| а | 919 | 8.6% |
| е | 794 | 7.5% |
| р | 675 | 6.3% |
| н | 650 | 6.1% |
| и | 616 | 5.8% |
| к | 566 | 5.3% |
| с | 506 | 4.8% |
| т | 373 | 3.5% |
| л | 366 | 3.4% |
| Other values (53) | 4234 |
Common
| Value | Count | Frequency (%) |
| (space) | 442357 | 69.6% |
| . | 57999 | 9.1% |
| , | 57382 | 9.0% |
| ' | 29445 | 4.6% |
| / | 13588 | 2.1% |
| - | 10718 | 1.7% |
| & | 8904 | 1.4% |
| : | 3333 | 0.5% |
| 3 | 1340 | 0.2% |
| ) | 1268 | 0.2% |
| Other values (45) | 9084 | 1.4% |
Greek
| Value | Count | Frequency (%) |
| ν | 17 | 8.5% |
| α | 16 | 8.0% |
| ο | 11 | 5.5% |
| Α | 10 | 5.0% |
| ς | 8 | 4.0% |
| τ | 8 | 4.0% |
| Κ | 8 | 4.0% |
| κ | 7 | 3.5% |
| ρ | 7 | 3.5% |
| ι | 7 | 3.5% |
| Other values (40) | 102 |
Hangul
| Value | Count | Frequency (%) |
| 설 | 2 | 3.8% |
| 오 | 2 | 3.8% |
| 리 | 2 | 3.8% |
| 농 | 2 | 3.8% |
| 심 | 2 | 3.8% |
| 자 | 2 | 3.8% |
| 샤 | 1 | 1.9% |
| 이 | 1 | 1.9% |
| 칠 | 1 | 1.9% |
| 성 | 1 | 1.9% |
| Other values (37) | 37 |
Thai
| Value | Count | Frequency (%) |
| า | 42 | 12.7% |
| ม | 37 | 11.2% |
| ่ | 23 | 7.0% |
| ร | 23 | 7.0% |
| ต | 19 | 5.8% |
| แ | 18 | 5.5% |
| ้ | 12 | 3.6% |
| อ | 12 | 3.6% |
| ล | 12 | 3.6% |
| ิ | 12 | 3.6% |
| Other values (25) | 120 |
Katakana
| Value | Count | Frequency (%) |
| ス | 5 | 8.5% |
| グ | 4 | 6.8% |
| リ | 4 | 6.8% |
| ア | 4 | 6.8% |
| ル | 4 | 6.8% |
| マ | 3 | 5.1% |
| ン | 3 | 5.1% |
| ッ | 3 | 5.1% |
| メ | 2 | 3.4% |
| ミ | 2 | 3.4% |
| Other values (20) | 25 |
Arabic
| Value | Count | Frequency (%) |
| ا | 41 | |
| ل | 23 | |
| و | 22 | |
| ن | 19 | |
| ر | 18 | |
| ي | 17 | |
| ك | 16 | 7.3% |
| س | 8 | 3.7% |
| ف | 8 | 3.7% |
| ب | 7 | 3.2% |
| Other values (16) | 40 |
Hebrew
| Value | Count | Frequency (%) |
| ו | 9 | |
| י | 7 | 9.6% |
| ב | 6 | 8.2% |
| ר | 6 | 8.2% |
| ק | 5 | 6.8% |
| מ | 5 | 6.8% |
| ת | 5 | 6.8% |
| פ | 4 | 5.5% |
| ה | 4 | 5.5% |
| נ | 4 | 5.5% |
| Other values (11) | 18 |
Hiragana
| Value | Count | Frequency (%) |
| お | 3 | |
| ん | 3 | |
| や | 3 | |
| き | 2 | 8.7% |
| た | 2 | 8.7% |
| り | 1 | 4.3% |
| な | 1 | 4.3% |
| が | 1 | 4.3% |
| に | 1 | 4.3% |
| え | 1 | 4.3% |
| Other values (5) | 5 |
Inherited
| Value | Count | Frequency (%) |
| ́ | 1 |
Most occurring blocks
| Value | Count | Frequency (%) |
| ASCII | 4449117 | |
| None | 20347 | 0.5% |
| Cyrillic | 10645 | 0.2% |
| Thai | 330 | < 0.1% |
| CJK | 245 | < 0.1% |
| Arabic | 220 | < 0.1% |
| Hebrew | 73 | < 0.1% |
| Katakana | 62 | < 0.1% |
| Hangul | 53 | < 0.1% |
| Punctuation | 25 | < 0.1% |
| Other values (4) | 40 | < 0.1% |
Most frequent character per block
ASCII
| Value | Count | Frequency (%) |
| (space) | 442357 | 9.9% |
| e | 380059 | 8.5% |
| a | 327513 | 7.4% |
| r | 296410 | 6.7% |
| o | 286876 | 6.4% |
| n | 247897 | 5.6% |
| i | 230699 | 5.2% |
| s | 199096 | 4.5% |
| t | 171925 | 3.9% |
| l | 166222 | 3.7% |
| Other values (77) | 1700063 | 38.2% |
None
| Value | Count | Frequency (%) |
| é | 11056 | 54.3% |
| è | 3546 | 17.4% |
| ü | 729 | 3.6% |
| ó | 573 | 2.8% |
| í | 455 | 2.2% |
| ô | 429 | 2.1% |
| â | 388 | 1.9% |
| ä | 325 | 1.6% |
| ê | 296 | 1.5% |
| î | 283 | 1.4% |
| Other values (134) | 2267 | 11.1% |
Cyrillic
| Value | Count | Frequency (%) |
| о | 946 | 8.9% |
| а | 919 | 8.6% |
| е | 794 | 7.5% |
| р | 675 | 6.3% |
| н | 650 | 6.1% |
| и | 616 | 5.8% |
| к | 566 | 5.3% |
| с | 506 | 4.8% |
| т | 373 | 3.5% |
| л | 366 | 3.4% |
| Other values (53) | 4234 | 39.8% |
Thai
| Value | Count | Frequency (%) |
| า | 42 | 12.7% |
| ม | 37 | 11.2% |
| ่ | 23 | 7.0% |
| ร | 23 | 7.0% |
| ต | 19 | 5.8% |
| แ | 18 | 5.5% |
| ้ | 12 | 3.6% |
| อ | 12 | 3.6% |
| ล | 12 | 3.6% |
| ิ | 12 | 3.6% |
| Other values (25) | 120 | 36.4% |
Arabic
| Value | Count | Frequency (%) |
| ا | 41 | 18.6% |
| ل | 23 | 10.5% |
| و | 22 | 10.0% |
| ن | 19 | 8.6% |
| ر | 18 | 8.2% |
| ي | 17 | 7.7% |
| ك | 16 | 7.3% |
| س | 8 | 3.6% |
| ف | 8 | 3.6% |
| ب | 7 | 3.2% |
| Other values (17) | 41 | 18.6% |
Currency Symbols
| Value | Count | Frequency (%) |
| € | 11 | 100.0% |
Hebrew
| Value | Count | Frequency (%) |
| ו | 9 | 12.3% |
| י | 7 | 9.6% |
| ב | 6 | 8.2% |
| ר | 6 | 8.2% |
| ק | 5 | 6.8% |
| מ | 5 | 6.8% |
| ת | 5 | 6.8% |
| פ | 4 | 5.5% |
| ה | 4 | 5.5% |
| נ | 4 | 5.5% |
| Other values (11) | 18 | 24.7% |
Punctuation
| Value | Count | Frequency (%) |
| ’ | 9 | 36.0% |
| … | 7 | 28.0% |
| — | 7 | 28.0% |
| • | 2 | 8.0% |
CJK
| Value | Count | Frequency (%) |
| 旺 | 6 | 2.4% |
| 可 | 5 | 2.0% |
| 乐 | 5 | 2.0% |
| 品 | 5 | 2.0% |
| 雪 | 4 | 1.6% |
| 好 | 4 | 1.6% |
| 生 | 3 | 1.2% |
| 光 | 3 | 1.2% |
| 治 | 3 | 1.2% |
| 明 | 3 | 1.2% |
| Other values (146) | 204 | 83.3% |
Letterlike Symbols
| Value | Count | Frequency (%) |
| № | 5 | 100.0% |
Katakana
| Value | Count | Frequency (%) |
| ス | 5 | 8.1% |
| グ | 4 | 6.5% |
| リ | 4 | 6.5% |
| ア | 4 | 6.5% |
| ル | 4 | 6.5% |
| マ | 3 | 4.8% |
| ン | 3 | 4.8% |
| ッ | 3 | 4.8% |
| ー | 3 | 4.8% |
| メ | 2 | 3.2% |
| Other values (21) | 27 | 43.5% |
Hiragana
| Value | Count | Frequency (%) |
| お | 3 | 13.0% |
| ん | 3 | 13.0% |
| や | 3 | 13.0% |
| き | 2 | 8.7% |
| た | 2 | 8.7% |
| り | 1 | 4.3% |
| な | 1 | 4.3% |
| が | 1 | 4.3% |
| に | 1 | 4.3% |
| え | 1 | 4.3% |
| Other values (5) | 5 | 21.7% |
Hangul
| Value | Count | Frequency (%) |
| 설 | 2 | 3.8% |
| 오 | 2 | 3.8% |
| 리 | 2 | 3.8% |
| 농 | 2 | 3.8% |
| 심 | 2 | 3.8% |
| 자 | 2 | 3.8% |
| 샤 | 1 | 1.9% |
| 이 | 1 | 1.9% |
| 칠 | 1 | 1.9% |
| 성 | 1 | 1.9% |
| Other values (37) | 37 | 69.8% |
Diacriticals
| Value | Count | Frequency (%) |
| ́ | 1 |
energy_100g
Real number (ℝ)
MISSING  SKEWED  ZEROS 
| Distinct | 3997 |
|---|---|
| Distinct (%) | 1.5% |
| Missing | 59659 |
| Missing (%) | 18.6% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 1141.9146 |
| Minimum | 0 |
|---|---|
| Maximum | 3251373 |
| Zeros | 8909 |
| Zeros (%) | 2.8% |
| Negative | 0 |
| Negative (%) | 0.0% |
| Memory size | 2.4 MiB |
Quantile statistics
| Minimum | 0 |
|---|---|
| 5-th percentile | 71 |
| Q1 | 377 |
| median | 1100 |
| Q3 | 1674 |
| 95-th percentile | 2389 |
| Maximum | 3251373 |
| Range | 3251373 |
| Interquartile range (IQR) | 1297 |
Descriptive statistics
| Standard deviation | 6447.1541 |
|---|---|
| Coefficient of variation (CV) | 5.6459161 |
| Kurtosis | 247388.17 |
| Mean | 1141.9146 |
| Median Absolute Deviation (MAD) | 657 |
| Skewness | 491.00398 |
| Sum | 2.9816875 × 10⁸ |
| Variance | 41565796 |
| Monotonicity | Not monotonic |
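Several of the statistics above are simple functions of one another, which gives a quick way to sanity-check the report. The sketch below, in plain Python, recomputes the range, IQR, variance and coefficient of variation for energy_100g from the reported figures. The variable names are illustrative and not part of the profiling tool.

```python
# Values copied from the energy_100g tables above.
minimum, maximum = 0.0, 3251373.0
q1, q3 = 377.0, 1674.0
mean, std = 1141.9146, 6447.1541

value_range = maximum - minimum  # Range
iqr = q3 - q1                    # Interquartile range (IQR)
variance = std ** 2              # Variance is the squared standard deviation
cv = std / mean                  # Coefficient of variation (CV)

print(value_range, iqr, round(variance), round(cv, 4))
```

A CV above 5 alongside a median of 1100 suggests that the mean and standard deviation are dominated by a handful of extreme entries, so the quantile statistics are probably the safer summary for this column.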
| Value | Count | Frequency (%) |
| 0 | 8909 | 2.8% |
| 2092 | 5075 | 1.6% |
| 1674 | 4012 | 1.3% |
| 1494 | 3916 | 1.2% |
| 1644 | 3282 | 1.0% |
| 1393 | 3225 | 1.0% |
| 1046 | 2945 | 0.9% |
| 1569 | 2825 | 0.9% |
| 1795 | 2350 | 0.7% |
| 1197 | 2314 | 0.7% |
| Other values (3987) | 222260 | 69.3% |
| (Missing) | 59659 | 18.6% |
| Value | Count | Frequency (%) |
| 0 | 8909 | 2.8% |
| 0.02 | 1 | < 0.1% |
| 0.42 | 1 | < 0.1% |
| 0.48 | 1 | < 0.1% |
| 0.6 | 1 | < 0.1% |
| 0.8 | 7 | < 0.1% |
| 0.9 | 4 | < 0.1% |
| 0.92 | 4 | < 0.1% |
| 1 | 50 | < 0.1% |
| 1.1 | 1 | < 0.1% |
| Value | Count | Frequency (%) |
| 3251373 | 1 | < 0.1% |
| 231199 | 1 | < 0.1% |
| 182764 | 1 | < 0.1% |
| 110579 | 1 | < 0.1% |
| 94140 | 1 | < 0.1% |
| 87217 | 1 | < 0.1% |
| 69292 | 1 | < 0.1% |
| 26861 | 1 | < 0.1% |
| 22000 | 1 | < 0.1% |
| 18700 | 1 | < 0.1% |
salt_100g
Real number (ℝ)
MISSING  SKEWED  ZEROS 
| Distinct | 5586 |
|---|---|
| Distinct (%) | 2.2% |
| Missing | 65262 |
| Missing (%) | 20.3% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 2.0286239 |
| Minimum | 0 |
|---|---|
| Maximum | 64312.8 |
| Zeros | 34174 |
| Zeros (%) | 10.7% |
| Negative | 0 |
| Negative (%) | 0.0% |
| Memory size | 2.4 MiB |
Quantile statistics
| Minimum | 0 |
|---|---|
| 5-th percentile | 0 |
| Q1 | 0.0635 |
| median | 0.58166 |
| Q3 | 1.37414 |
| 95-th percentile | 4.064 |
| Maximum | 64312.8 |
| Range | 64312.8 |
| Interquartile range (IQR) | 1.31064 |
Descriptive statistics
| Standard deviation | 128.26945 |
|---|---|
| Coefficient of variation (CV) | 63.229784 |
| Kurtosis | 247314.47 |
| Mean | 2.0286239 |
| Median Absolute Deviation (MAD) | 0.55666 |
| Skewness | 493.50379 |
| Sum | 518333.71 |
| Variance | 16453.053 |
| Monotonicity | Not monotonic |
| Value | Count | Frequency (%) |
| 0 | 34174 | 10.7% |
| 0.01 | 3692 | 1.2% |
| 0.1 | 3467 | 1.1% |
| 1 | 2231 | 0.7% |
| 0.0254 | 2093 | 0.7% |
| 1.27 | 1941 | 0.6% |
| 1.63322 | 1825 | 0.6% |
| 0.127 | 1779 | 0.6% |
| 0.03 | 1636 | 0.5% |
| 1.3 | 1551 | 0.5% |
| Other values (5576) | 201121 | 62.7% |
| (Missing) | 65262 | 20.3% |
| Value | Count | Frequency (%) |
| 0 | 34174 | 10.7% |
| 5 × 10⁻⁸ | 1 | < 0.1% |
| 9.999999 × 10⁻⁸ | 2 | < 0.1% |
| 1 × 10⁻⁶ | 1 | < 0.1% |
| 5 × 10⁻⁶ | 1 | < 0.1% |
| 7.874 × 10⁻⁶ | 1 | < 0.1% |
| 1 × 10⁻⁵ | 5 | < 0.1% |
| 1.3 × 10⁻⁵ | 4 | < 0.1% |
| 2 × 10⁻⁵ | 1 | < 0.1% |
| 2.413 × 10⁻⁵ | 1 | < 0.1% |
| Value | Count | Frequency (%) |
| 64312.8 | 1 | < 0.1% |
| 3556 | 1 | < 0.1% |
| 3048 | 1 | < 0.1% |
| 2452.41318 | 1 | < 0.1% |
| 2177.14322 | 1 | < 0.1% |
| 2032 | 1 | < 0.1% |
| 1799.16582 | 1 | < 0.1% |
| 1669.14322 | 1 | < 0.1% |
| 1318.38192 | 1 | < 0.1% |
| 1139.1519 | 1 | < 0.1% |
sodium_100g
Real number (ℝ)
MISSING  SKEWED  ZEROS 
| Distinct | 5291 |
|---|---|
| Distinct (%) | 2.1% |
| Missing | 65309 |
| Missing (%) | 20.4% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 0.79881546 |
| Minimum | 0 |
|---|---|
| Maximum | 25320 |
| Zeros | 34131 |
| Zeros (%) | 10.6% |
| Negative | 0 |
| Negative (%) | 0.0% |
| Memory size | 2.4 MiB |
Quantile statistics
| Minimum | 0 |
|---|---|
| 5-th percentile | 0 |
| Q1 | 0.025 |
| median | 0.229 |
| Q3 | 0.541 |
| 95-th percentile | 1.6 |
| Maximum | 25320 |
| Range | 25320 |
| Interquartile range (IQR) | 0.516 |
Descriptive statistics
| Standard deviation | 50.504428 |
|---|---|
| Coefficient of variation (CV) | 63.22415 |
| Kurtosis | 247269.02 |
| Mean | 0.79881546 |
| Median Absolute Deviation (MAD) | 0.21915748 |
| Skewness | 493.45847 |
| Sum | 204067.79 |
| Variance | 2550.6972 |
| Monotonicity | Not monotonic |
| Value | Count | Frequency (%) |
| 0 | 34131 | 10.6% |
| 0.003937007874 | 3687 | 1.1% |
| 0.03937007874 | 3451 | 1.1% |
| 0.3937007874 | 2216 | 0.7% |
| 0.01 | 2092 | 0.7% |
| 0.5 | 1939 | 0.6% |
| 0.01181102362 | 1927 | 0.6% |
| 0.643 | 1848 | 0.6% |
| 0.05 | 1779 | 0.6% |
| 0.5118110236 | 1545 | 0.5% |
| Other values (5281) | 200848 | 62.6% |
| (Missing) | 65309 | 20.4% |
| Value | Count | Frequency (%) |
| 0 | 34131 | 10.6% |
| 1.968503937 × 10⁻⁸ | 1 | < 0.1% |
| 3.93700748 × 10⁻⁸ | 2 | < 0.1% |
| 3.937007874 × 10⁻⁷ | 1 | < 0.1% |
| 1.968503937 × 10⁻⁶ | 1 | < 0.1% |
| 3.1 × 10⁻⁶ | 1 | < 0.1% |
| 3.937007874 × 10⁻⁶ | 5 | < 0.1% |
| 5.118110236 × 10⁻⁶ | 4 | < 0.1% |
| 7.874015748 × 10⁻⁶ | 1 | < 0.1% |
| 9.5 × 10⁻⁶ | 1 | < 0.1% |
| Value | Count | Frequency (%) |
| 25320 | 1 | < 0.1% |
| 1400 | 1 | < 0.1% |
| 1200 | 1 | < 0.1% |
| 965.517 | 1 | < 0.1% |
| 857.143 | 1 | < 0.1% |
| 800 | 1 | < 0.1% |
| 708.333 | 1 | < 0.1% |
| 657.143 | 1 | < 0.1% |
| 519.048 | 1 | < 0.1% |
| 448.485 | 1 | < 0.1% |
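The salt_100g and sodium_100g columns appear to be linked by a fixed conversion factor: the reported quantiles and maxima are consistent with salt ≈ 2.54 × sodium (for example 64312.8 and 25320). The sketch below checks a few pairs read off the tables above. The factor 2.54 is inferred from these numbers rather than stated in the report, and `SALT_PER_SODIUM` and `salt_from_sodium` are names chosen here for illustration.

```python
import math

SALT_PER_SODIUM = 2.54  # assumed factor, inferred from the reported values

# (sodium_100g, salt_100g) pairs taken from the quantile and maximum rows.
pairs = [
    (0.025, 0.0635),     # Q1
    (0.229, 0.58166),    # median
    (0.541, 1.37414),    # Q3
    (1.6, 4.064),        # 95th percentile
    (25320.0, 64312.8),  # maximum
]

def salt_from_sodium(sodium_g):
    """Convert grams of sodium to the equivalent grams of salt."""
    return sodium_g * SALT_PER_SODIUM

# Collect any pair that does not match the assumed factor.
mismatches = [
    (sodium, salt)
    for sodium, salt in pairs
    if not math.isclose(salt_from_sodium(sodium), salt, rel_tol=1e-6)
]
print("pairs that break the assumed factor:", mismatches)
```

If the relationship holds across the full data, the two columns carry nearly the same information, which would explain their almost identical skewness and missing-value counts.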
fiber_100g
Real number (ℝ)
MISSING  SKEWED  ZEROS 
| Distinct | 1016 |
|---|---|
| Distinct (%) | 0.5% |
| Missing | 119886 |
| Missing (%) | 37.4% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 2.8621109 |
| Minimum | -6.7 |
|---|---|
| Maximum | 5380 |
| Zeros | 68833 |
| Zeros (%) | 21.5% |
| Negative | 1 |
| Negative (%) | < 0.1% |
| Memory size | 2.4 MiB |
Quantile statistics
| Minimum | -6.7 |
|---|---|
| 5-th percentile | 0 |
| Q1 | 0 |
| median | 1.5 |
| Q3 | 3.6 |
| 95-th percentile | 10.5 |
| Maximum | 5380 |
| Range | 5386.7 |
| Interquartile range (IQR) | 3.6 |
Descriptive statistics
| Standard deviation | 12.867578 |
|---|---|
| Coefficient of variation (CV) | 4.4958348 |
| Kurtosis | 151802.73 |
| Mean | 2.8621109 |
| Median Absolute Deviation (MAD) | 1.5 |
| Skewness | 363.54781 |
| Sum | 574958.02 |
| Variance | 165.57456 |
| Monotonicity | Not monotonic |
| Value | Count | Frequency (%) |
| 0 | 68833 | 21.5% |
| 3.6 | 8525 | 2.7% |
| 3.3 | 3991 | 1.2% |
| 1.8 | 3886 | 1.2% |
| 0.8 | 3829 | 1.2% |
| 7.1 | 3707 | 1.2% |
| 2 | 3531 | 1.1% |
| 1.6 | 3428 | 1.1% |
| 0.5 | 3419 | 1.1% |
| 1.2 | 3278 | 1.0% |
| Other values (1006) | 94459 | 29.4% |
| (Missing) | 119886 | 37.4% |
| Value | Count | Frequency (%) |
| -6.7 | 1 | < 0.1% |
| 0 | 68833 | 21.5% |
| 0.0001 | 2 | < 0.1% |
| 0.0002 | 1 | < 0.1% |
| 0.001 | 16 | < 0.1% |
| 0.002 | 3 | < 0.1% |
| 0.004 | 1 | < 0.1% |
| 0.00416 | 1 | < 0.1% |
| 0.005 | 2 | < 0.1% |
| 0.01 | 72 | < 0.1% |
| Value | Count | Frequency (%) |
| 5380 | 1 | < 0.1% |
| 250 | 1 | < 0.1% |
| 178 | 1 | < 0.1% |
| 166.7 | 1 | < 0.1% |
| 100 | 10 | < 0.1% |
| 99 | 1 | < 0.1% |
| 94.8 | 1 | < 0.1% |
| 92.4 | 1 | < 0.1% |
| 90 | 1 | < 0.1% |
| 88 | 2 | < 0.1% |
additives_n
Real number (ℝ)
MISSING  ZEROS 
| Distinct | 31 |
|---|---|
| Distinct (%) | < 0.1% |
| Missing | 71833 |
| Missing (%) | 22.4% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 1.9360245 |
| Minimum | 0 |
|---|---|
| Maximum | 31 |
| Zeros | 94259 |
| Zeros (%) | 29.4% |
| Negative | 0 |
| Negative (%) | 0.0% |
| Memory size | 2.4 MiB |
Quantile statistics
| Minimum | 0 |
|---|---|
| 5-th percentile | 0 |
| Q1 | 0 |
| median | 1 |
| Q3 | 3 |
| 95-th percentile | 7 |
| Maximum | 31 |
| Range | 31 |
| Interquartile range (IQR) | 3 |
Descriptive statistics
| Standard deviation | 2.5020195 |
|---|---|
| Coefficient of variation (CV) | 1.2923491 |
| Kurtosis | 7.4179254 |
| Mean | 1.9360245 |
| Median Absolute Deviation (MAD) | 1 |
| Skewness | 2.1753736 |
| Sum | 481952 |
| Variance | 6.2601014 |
| Monotonicity | Not monotonic |
| Value | Count | Frequency (%) |
| 0 | 94259 | 29.4% |
| 1 | 46509 | 14.5% |
| 2 | 36520 | 11.4% |
| 3 | 23680 | 7.4% |
| 4 | 15243 | 4.8% |
| 5 | 10935 | 3.4% |
| 6 | 7290 | 2.3% |
| 7 | 4702 | 1.5% |
| 8 | 3359 | 1.0% |
| 9 | 2194 | 0.7% |
| Other values (21) | 4248 | 1.3% |
| (Missing) | 71833 | 22.4% |
| Value | Count | Frequency (%) |
| 0 | 94259 | 29.4% |
| 1 | 46509 | 14.5% |
| 2 | 36520 | 11.4% |
| 3 | 23680 | 7.4% |
| 4 | 15243 | 4.8% |
| 5 | 10935 | 3.4% |
| 6 | 7290 | 2.3% |
| 7 | 4702 | 1.5% |
| 8 | 3359 | 1.0% |
| 9 | 2194 | 0.7% |
| Value | Count | Frequency (%) |
| 31 | 4 | < 0.1% |
| 29 | 2 | < 0.1% |
| 28 | 2 | < 0.1% |
| 27 | 2 | < 0.1% |
| 26 | 3 | < 0.1% |
| 25 | 11 | < 0.1% |
| 24 | 10 | < 0.1% |
| 23 | 15 | < 0.1% |
| 22 | 27 | < 0.1% |
| 21 | 21 | < 0.1% |
sugars_100g
Real number (ℝ)
MISSING  ZEROS 
| Distinct | 4068 |
|---|---|
| Distinct (%) | 1.7% |
| Missing | 75801 |
| Missing (%) | 23.6% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 16.003484 |
| Minimum | -17.86 |
|---|---|
| Maximum | 3520 |
| Zeros | 37077 |
| Zeros (%) | 11.6% |
| Negative | 7 |
| Negative (%) | < 0.1% |
| Memory size | 2.4 MiB |
Quantile statistics
| Minimum | -17.86 |
|---|---|
| 5-th percentile | 0 |
| Q1 | 1.3 |
| median | 5.71 |
| Q3 | 24 |
| 95-th percentile | 62.5 |
| Maximum | 3520 |
| Range | 3537.86 |
| Interquartile range (IQR) | 22.7 |
Descriptive statistics
| Standard deviation | 22.327284 |
|---|---|
| Coefficient of variation (CV) | 1.3951515 |
| Kurtosis | 2477.5694 |
| Mean | 16.003484 |
| Median Absolute Deviation (MAD) | 5.71 |
| Skewness | 17.201619 |
| Sum | 3920389.4 |
| Variance | 498.50763 |
| Monotonicity | Not monotonic |
| Value | Count | Frequency (%) |
| 0 | 37077 | 11.6% |
| 3.57 | 7148 | 2.2% |
| 0.5 | 4589 | 1.4% |
| 3.33 | 3706 | 1.2% |
| 1 | 2666 | 0.8% |
| 20 | 2347 | 0.7% |
| 6.67 | 2269 | 0.7% |
| 10 | 2192 | 0.7% |
| 50 | 2129 | 0.7% |
| 2 | 2042 | 0.6% |
| Other values (4058) | 178806 | 55.7% |
| (Missing) | 75801 | 23.6% |
| Value | Count | Frequency (%) |
| -17.86 | 1 | < 0.1% |
| -6.67 | 1 | < 0.1% |
| -6.25 | 1 | < 0.1% |
| -3.57 | 1 | < 0.1% |
| -1.2 | 1 | < 0.1% |
| -0.8 | 1 | < 0.1% |
| -0.1 | 1 | < 0.1% |
| 0 | 37077 | 11.6% |
| 0.0001 | 8 | < 0.1% |
| 0.0005 | 1 | < 0.1% |
| Value | Count | Frequency (%) |
| 3520 | 1 | < 0.1% |
| 166.67 | 1 | < 0.1% |
| 134 | 1 | < 0.1% |
| 110.71 | 1 | < 0.1% |
| 105 | 1 | < 0.1% |
| 104 | 1 | < 0.1% |
| 103.5 | 4 | < 0.1% |
| 103 | 1 | < 0.1% |
| 100.8 | 1 | < 0.1% |
| 100 | 1011 | 0.3% |
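The extreme values above include entries that cannot be physical for a per-100 g quantity: negative amounts and amounts above 100 g. A minimal sketch of a plausibility filter follows, in plain Python. The bounds [0, 100] are an assumption that suits mass-based columns such as sugars_100g, fat_100g and fiber_100g, but not energy_100g or the nutrition scores, and `flag_implausible` is a hypothetical helper rather than anything the report provides.

```python
import math

def flag_implausible(values, low=0.0, high=100.0):
    """Return the non-missing values that fall outside [low, high]."""
    flagged = []
    for v in values:
        if v is None or (isinstance(v, float) and math.isnan(v)):
            continue  # missing values are not range errors
        if v < low or v > high:
            flagged.append(v)
    return flagged

# A few sugars_100g values from the extreme-value tables above,
# mixed with ordinary values and one missing entry.
sample = [-17.86, -0.1, 0.0, 3.57, 100.0, 100.8, 166.67, 3520.0, float("nan")]
print(flag_implausible(sample))
```

Whether flagged rows should be dropped, clipped or set to missing depends on the downstream analysis; the filter only identifies them.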
fat_100g
Real number (ℝ)
MISSING  ZEROS 
| Distinct | 3378 |
|---|---|
| Distinct (%) | 1.4% |
| Missing | 76881 |
| Missing (%) | 24.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 12.730379 |
| Minimum | 0 |
|---|---|
| Maximum | 714.29 |
| Zeros | 64504 |
| Zeros (%) | 20.1% |
| Negative | 0 |
| Negative (%) | 0.0% |
| Memory size | 2.4 MiB |
Quantile statistics
| Minimum | 0 |
|---|---|
| 5-th percentile | 0 |
| Q1 | 0 |
| median | 5 |
| Q3 | 20 |
| 95-th percentile | 46.43 |
| Maximum | 714.29 |
| Range | 714.29 |
| Interquartile range (IQR) | 20 |
Descriptive statistics
| Standard deviation | 17.578747 |
|---|---|
| Coefficient of variation (CV) | 1.3808503 |
| Kurtosis | 17.184558 |
| Mean | 12.730379 |
| Median Absolute Deviation (MAD) | 5 |
| Skewness | 2.4647045 |
| Sum | 3104824.8 |
| Variance | 309.01234 |
| Monotonicity | Not monotonic |
| Value | Count | Frequency (%) |
| 0 | 64504 | 20.1% |
| 25 | 3409 | 1.1% |
| 0.5 | 3202 | 1.0% |
| 32.14 | 2981 | 0.9% |
| 20 | 2688 | 0.8% |
| 1.79 | 2528 | 0.8% |
| 28.57 | 2460 | 0.8% |
| 0.1 | 2437 | 0.8% |
| 21.43 | 2411 | 0.8% |
| 10 | 2284 | 0.7% |
| Other values (3368) | 154987 | 48.3% |
| (Missing) | 76881 | 24.0% |
| Value | Count | Frequency (%) |
| 0 | 64504 | 20.1% |
| 0.0001 | 2 | < 0.1% |
| 0.000133 | 1 | < 0.1% |
| 0.001 | 1 | < 0.1% |
| 0.003 | 1 | < 0.1% |
| 0.004 | 2 | < 0.1% |
| 0.005 | 3 | < 0.1% |
| 0.007 | 1 | < 0.1% |
| 0.01 | 43 | < 0.1% |
| 0.012 | 2 | < 0.1% |
| Value | Count | Frequency (%) |
| 714.29 | 1 | < 0.1% |
| 380 | 1 | < 0.1% |
| 105 | 1 | < 0.1% |
| 101 | 1 | < 0.1% |
| 100 | 1288 | 0.4% |
| 99.9 | 16 | < 0.1% |
| 99.85 | 1 | < 0.1% |
| 99.82 | 1 | < 0.1% |
| 99.8 | 17 | < 0.1% |
| 99.7 | 5 | < 0.1% |
saturated_fat_100g
Real number (ℝ)
MISSING  ZEROS 
| Distinct | 2197 |
|---|---|
| Distinct (%) | 1.0% |
| Missing | 91218 |
| Missing (%) | 28.4% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 5.1299323 |
| Minimum | 0 |
|---|---|
| Maximum | 550 |
| Zeros | 68736 |
| Zeros (%) | 21.4% |
| Negative | 0 |
| Negative (%) | 0.0% |
| Memory size | 2.4 MiB |
Quantile statistics
| Minimum | 0 |
|---|---|
| 5-th percentile | 0 |
| Q1 | 0 |
| median | 1.79 |
| Q3 | 7.14 |
| 95-th percentile | 20 |
| Maximum | 550 |
| Range | 550 |
| Interquartile range (IQR) | 7.14 |
Descriptive statistics
| Standard deviation | 8.0142381 |
|---|---|
| Coefficient of variation (CV) | 1.5622503 |
| Kurtosis | 116.64216 |
| Mean | 5.1299323 |
| Median Absolute Deviation (MAD) | 1.79 |
| Skewness | 4.8175969 |
| Sum | 1177596.5 |
| Variance | 64.228013 |
| Monotonicity | Not monotonic |
| Value | Count | Frequency (%) |
| 0 | 68736 | 21.4% |
| 0.1 | 5355 | 1.7% |
| 3.57 | 3487 | 1.1% |
| 0.5 | 3302 | 1.0% |
| 7.14 | 2880 | 0.9% |
| 0.2 | 2601 | 0.8% |
| 1 | 2444 | 0.8% |
| 0.3 | 2335 | 0.7% |
| 3.33 | 2213 | 0.7% |
| 1.79 | 2190 | 0.7% |
| Other values (2187) | 134011 | 41.8% |
| (Missing) | 91218 | 28.4% |
| Value | Count | Frequency (%) |
| 0 | 68736 | 21.4% |
| 0.0001 | 11 | < 0.1% |
| 0.001 | 30 | < 0.1% |
| 0.002 | 10 | < 0.1% |
| 0.003 | 4 | < 0.1% |
| 0.0032 | 1 | < 0.1% |
| 0.004 | 3 | < 0.1% |
| 0.005 | 11 | < 0.1% |
| 0.006 | 2 | < 0.1% |
| 0.00667 | 1 | < 0.1% |
| Value | Count | Frequency (%) |
| 550 | 1 | < 0.1% |
| 210 | 1 | < 0.1% |
| 175.38 | 1 | < 0.1% |
| 100 | 12 | < 0.1% |
| 99.9 | 1 | < 0.1% |
| 99 | 2 | < 0.1% |
| 98 | 1 | < 0.1% |
| 96 | 2 | < 0.1% |
| 95.5 | 1 | < 0.1% |
| 95 | 5 | < 0.1% |
nutrition_score_uk_100g
Real number (ℝ)
MISSING  ZEROS 
| Distinct | 55 |
|---|---|
| Distinct (%) | < 0.1% |
| Missing | 99562 |
| Missing (%) | 31.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 9.0580489 |
| Minimum | -15 |
|---|---|
| Maximum | 40 |
| Zeros | 13588 |
| Zeros (%) | 4.2% |
| Negative | 37361 |
| Negative (%) | 11.6% |
| Memory size | 2.4 MiB |
Quantile statistics
| Minimum | -15 |
|---|---|
| 5-th percentile | -5 |
| Q1 | 1 |
| median | 9 |
| Q3 | 16 |
| 95-th percentile | 24 |
| Maximum | 40 |
| Range | 55 |
| Interquartile range (IQR) | 15 |
Descriptive statistics
| Standard deviation | 9.1835893 |
|---|---|
| Coefficient of variation (CV) | 1.0138595 |
| Kurtosis | -1.0755201 |
| Mean | 9.0580489 |
| Median Absolute Deviation (MAD) | 8 |
| Skewness | 0.1320062 |
| Sum | 2003731 |
| Variance | 84.338312 |
| Monotonicity | Not monotonic |
| Value | Count | Frequency (%) |
| 0 | 13588 | 4.2% |
| 1 | 11932 | 3.7% |
| 2 | 11083 | 3.5% |
| 14 | 10689 | 3.3% |
| -1 | 8827 | 2.8% |
| 13 | 8409 | 2.6% |
| 12 | 8239 | 2.6% |
| 11 | 8093 | 2.5% |
| 3 | 7620 | 2.4% |
| 20 | 7390 | 2.3% |
| Other values (45) | 125340 | 39.1% |
| (Missing) | 99562 | 31.0% |
| Value | Count | Frequency (%) |
| -15 | 1 | < 0.1% |
| -14 | 5 | < 0.1% |
| -13 | 23 | < 0.1% |
| -12 | 46 | < 0.1% |
| -11 | 90 | < 0.1% |
| -10 | 157 | < 0.1% |
| -9 | 315 | 0.1% |
| -8 | 602 | 0.2% |
| -7 | 963 | 0.3% |
| -6 | 4926 | 1.5% |
| Value | Count | Frequency (%) |
| 40 | 3 | < 0.1% |
| 38 | 1 | < 0.1% |
| 37 | 2 | < 0.1% |
| 36 | 17 | < 0.1% |
| 35 | 34 | < 0.1% |
| 34 | 20 | < 0.1% |
| 33 | 101 | < 0.1% |
| 32 | 64 | < 0.1% |
| 31 | 76 | < 0.1% |
| 30 | 192 | 0.1% |
nutrition_score_fr_100g
Real number (ℝ)
MISSING  ZEROS 
| Distinct | 55 |
|---|---|
| Distinct (%) | < 0.1% |
| Missing | 99562 |
| Missing (%) | 31.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 9.165535 |
| Minimum | -15 |
|---|---|
| Maximum | 40 |
| Zeros | 12763 |
| Zeros (%) | 4.0% |
| Negative | 35706 |
| Negative (%) | 11.1% |
| Memory size | 2.4 MiB |
Quantile statistics
| Minimum | -15 |
|---|---|
| 5-th percentile | -5 |
| Q1 | 1 |
| median | 10 |
| Q3 | 16 |
| 95-th percentile | 24 |
| Maximum | 40 |
| Range | 55 |
| Interquartile range (IQR) | 15 |
Descriptive statistics
| Standard deviation | 9.0559029 |
|---|---|
| Coefficient of variation (CV) | 0.98803866 |
| Kurtosis | -1.0188856 |
| Mean | 9.165535 |
| Median Absolute Deviation (MAD) | 8 |
| Skewness | 0.11483636 |
| Sum | 2027508 |
| Variance | 82.009378 |
| Monotonicity | Not monotonic |
| Value | Count | Frequency (%) |
| 0 | 12763 | 4.0% |
| 1 | 11268 | 3.5% |
| 14 | 11253 | 3.5% |
| 2 | 10604 | 3.3% |
| 13 | 8827 | 2.8% |
| -1 | 8804 | 2.7% |
| 12 | 8658 | 2.7% |
| 11 | 8653 | 2.7% |
| 3 | 7857 | 2.4% |
| 15 | 7529 | 2.3% |
| Other values (45) | 124994 | 39.0% |
| (Missing) | 99562 | 31.0% |
| Value | Count | Frequency (%) |
| -15 | 1 | < 0.1% |
| -14 | 5 | < 0.1% |
| -13 | 23 | < 0.1% |
| -12 | 46 | < 0.1% |
| -11 | 90 | < 0.1% |
| -10 | 159 | < 0.1% |
| -9 | 315 | 0.1% |
| -8 | 601 | 0.2% |
| -7 | 950 | 0.3% |
| -6 | 4925 | 1.5% |
| Value | Count | Frequency (%) |
| 40 | 4 | < 0.1% |
| 38 | 1 | < 0.1% |
| 37 | 3 | < 0.1% |
| 36 | 17 | < 0.1% |
| 35 | 36 | < 0.1% |
| 34 | 20 | < 0.1% |
| 33 | 105 | < 0.1% |
| 32 | 73 | < 0.1% |
| 31 | 79 | < 0.1% |
| 30 | 207 | 0.1% |
nutrition_grade_fr
Categorical
MISSING 
| Distinct | 5 |
|---|---|
| Distinct (%) | < 0.1% |
| Missing | 99562 |
| Missing (%) | 31.0% |
| Memory size | 313.6 KiB |
Length
| Max length | 1 |
|---|---|
| Median length | 1 |
| Mean length | 1 |
| Min length | 1 |
Characters and Unicode
| Total characters | 221210 |
|---|---|
| Distinct characters | 5 |
| Distinct categories | 1 |
| Distinct scripts | 1 |
| Distinct blocks | 1 |
Unique
| Unique | 0 |
|---|---|
| Unique (%) | 0.0% |
Sample
| 1st row | d |
|---|---|
| 2nd row | b |
| 3rd row | d |
| 4th row | c |
| 5th row | d |
Common Values
| Value | Count | Frequency (%) |
| d | 62763 | 19.6% |
| c | 45538 | 14.2% |
| e | 43030 | 13.4% |
| a | 35634 | 11.1% |
| b | 34245 | 10.7% |
| (Missing) | 99562 | 31.0% |
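Each Frequency (%) entry in this table is the count divided by the number of observations (320772 for this dataset, missing rows included). As a small sketch, the snippet below recomputes the shares for nutrition_grade_fr from the counts above. The one-decimal formatting is chosen to match the report, and `share` is an illustrative helper name.

```python
N_OBSERVATIONS = 320772  # "Number of observations" from the dataset overview

# Counts copied from the Common Values table above.
counts = {
    "d": 62763,
    "c": 45538,
    "e": 43030,
    "a": 35634,
    "b": 34245,
    "(Missing)": 99562,
}

def share(count, total=N_OBSERVATIONS):
    """Format count/total as a percentage with one decimal place."""
    return f"{100 * count / total:.1f}%"

for value, count in counts.items():
    print(value, count, share(count))
```

The counts add up to the number of observations, so the six shares cover the whole column.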
Most occurring characters
| Value | Count | Frequency (%) |
| d | 62763 | 28.4% |
| c | 45538 | 20.6% |
| e | 43030 | 19.5% |
| a | 35634 | 16.1% |
| b | 34245 | 15.5% |
Most occurring categories
| Value | Count | Frequency (%) |
| Lowercase Letter | 221210 | 100.0% |
Most frequent character per category
Lowercase Letter
| Value | Count | Frequency (%) |
| d | 62763 | 28.4% |
| c | 45538 | 20.6% |
| e | 43030 | 19.5% |
| a | 35634 | 16.1% |
| b | 34245 | 15.5% |
Most occurring scripts
| Value | Count | Frequency (%) |
| Latin | 221210 | 100.0% |
Most frequent character per script
Latin
| Value | Count | Frequency (%) |
| d | 62763 | 28.4% |
| c | 45538 | 20.6% |
| e | 43030 | 19.5% |
| a | 35634 | 16.1% |
| b | 34245 | 15.5% |
Most occurring blocks
| Value | Count | Frequency (%) |
| ASCII | 221210 | 100.0% |
Most frequent character per block
ASCII
| Value | Count | Frequency (%) |
| d | 62763 | 28.4% |
| c | 45538 | 20.6% |
| e | 43030 | 19.5% |
| a | 35634 | 16.1% |
| b | 34245 | 15.5% |
cholesterol_100g
Real number (ℝ)
MISSING  SKEWED  ZEROS 
| Distinct | 537 |
|---|---|
| Distinct (%) | 0.4% |
| Missing | 176682 |
| Missing (%) | 55.1% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 0.020071383 |
| Minimum | 0 |
|---|---|
| Maximum | 95.238 |
| Zeros | 89441 |
| Zeros (%) | 27.9% |
| Negative | 0 |
| Negative (%) | 0.0% |
| Memory size | 2.4 MiB |
Quantile statistics
| Minimum | 0 |
|---|---|
| 5-th percentile | 0 |
| Q1 | 0 |
| median | 0 |
| Q3 | 0.02 |
| 95-th percentile | 0.09 |
| Maximum | 95.238 |
| Range | 95.238 |
| Interquartile range (IQR) | 0.02 |
Descriptive statistics
| Standard deviation | 0.35806161 |
|---|---|
| Coefficient of variation (CV) | 17.839408 |
| Kurtosis | 51631.976 |
| Mean | 0.020071383 |
| Median Absolute Deviation (MAD) | 0 |
| Skewness | 221.11781 |
| Sum | 2892.0856 |
| Variance | 0.12820811 |
| Monotonicity | Not monotonic |
| Value | Count | Frequency (%) |
| 0 | 89441 | 27.9% |
| 0.071 | 2462 | 0.8% |
| 0.107 | 2237 | 0.7% |
| 0.012 | 1909 | 0.6% |
| 0.089 | 1664 | 0.5% |
| 0.054 | 1651 | 0.5% |
| 0.018 | 1591 | 0.5% |
| 0.004 | 1503 | 0.5% |
| 0.036 | 1386 | 0.4% |
| 0.008 | 1209 | 0.4% |
| Other values (527) | 39037 | 12.2% |
| (Missing) | 176682 | 55.1% |
| Value | Count | Frequency (%) |
| 0 | 89441 | 27.9% |
| 4.5 × 10⁻⁵ | 1 | < 0.1% |
| 7.1 × 10⁻⁵ | 1 | < 0.1% |
| 0.0001 | 5 | < 0.1% |
| 0.0002 | 5 | < 0.1% |
| 0.0004 | 1 | < 0.1% |
| 0.000416 | 1 | < 0.1% |
| 0.00046 | 1 | < 0.1% |
| 0.0005 | 2 | < 0.1% |
| 0.0008 | 1 | < 0.1% |
| Value | Count | Frequency (%) |
| 95.238 | 1 | < 0.1% |
| 70.588 | 1 | < 0.1% |
| 62.5 | 1 | < 0.1% |
| 13.846 | 1 | < 0.1% |
| 10.9 | 1 | < 0.1% |
| 1.58 | 1 | < 0.1% |
| 1.291 | 1 | < 0.1% |
| 1.25 | 1 | < 0.1% |
| 1.081 | 1 | < 0.1% |
| 0.996 | 1 | < 0.1% |
First rows
| | code | countries_fr | product_name | brands | energy_100g | salt_100g | sodium_100g | fiber_100g | additives_n | sugars_100g | fat_100g | saturated_fat_100g | nutrition_score_uk_100g | nutrition_score_fr_100g | nutrition_grade_fr | cholesterol_100g |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | 0000000003087 | France | Farine de blé noir | Ferme t'y R'nao | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN |
| 1 | 0000000004530 | États-Unis | Banana Chips Sweetened (Whole) | NaN | 2243.0 | 0.00000 | 0.000 | 3.6 | 0.0 | 14.29 | 28.57 | 28.57 | 14.0 | 14.0 | d | 0.018 |
| 2 | 0000000004559 | États-Unis | Peanuts | Torn & Glasser | 1941.0 | 0.63500 | 0.250 | 7.1 | 0.0 | 17.86 | 17.86 | 0.00 | 0.0 | 0.0 | b | 0.000 |
| 3 | 0000000016087 | États-Unis | Organic Salted Nut Mix | Grizzlies | 2540.0 | 1.22428 | 0.482 | 7.1 | 0.0 | 3.57 | 57.14 | 5.36 | 12.0 | 12.0 | d | NaN |
| 4 | 0000000016094 | États-Unis | Organic Polenta | Bob's Red Mill | 1552.0 | NaN | NaN | 5.7 | 0.0 | NaN | 1.43 | NaN | NaN | NaN | NaN | NaN |
| 5 | 0000000016100 | États-Unis | Breadshop Honey Gone Nuts Granola | Unfi | 1933.0 | NaN | NaN | 7.7 | 0.0 | 11.54 | 18.27 | 1.92 | NaN | NaN | NaN | NaN |
| 6 | 0000000016117 | États-Unis | Organic Long Grain White Rice | Lundberg | 1490.0 | NaN | NaN | NaN | 0.0 | NaN | NaN | NaN | NaN | NaN | NaN | NaN |
| 7 | 0000000016124 | États-Unis | Organic Muesli | Daddy's Muesli | 1833.0 | 0.13970 | 0.055 | 9.4 | 2.0 | 15.62 | 18.75 | 4.69 | 7.0 | 7.0 | c | NaN |
| 8 | 0000000016193 | États-Unis | Organic Dark Chocolate Minis | Equal Exchange | 2406.0 | NaN | NaN | 7.5 | 0.0 | 42.50 | 37.50 | 22.50 | NaN | NaN | NaN | NaN |
| 9 | 0000000016513 | États-Unis | Organic Sunflower Oil | Napa Valley Naturals | 3586.0 | NaN | NaN | NaN | 0.0 | NaN | 100.00 | 7.14 | NaN | NaN | NaN | NaN |
Last rows
| | code | countries_fr | product_name | brands | energy_100g | salt_100g | sodium_100g | fiber_100g | additives_n | sugars_100g | fat_100g | saturated_fat_100g | nutrition_score_uk_100g | nutrition_score_fr_100g | nutrition_grade_fr | cholesterol_100g |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 320762 | 9908278636246 | Pologne | Szprot w oleju roslinnym | EvraFish | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN |
| 320763 | 99111250 | France | Thé vert Earl grey | Lobodis | 21.0 | 0.0254 | 0.01 | 0.2 | 0.0 | 0.5 | 0.2 | 0.2 | 0.0 | 2.0 | c | NaN |
| 320764 | 9918 | France | Cheese cake thé vert, yuzu | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN |
| 320765 | 9935010000003 | France | Rillette d'oie | Sans marque,D.Lambert | NaN | NaN | NaN | NaN | 0.0 | NaN | NaN | NaN | NaN | NaN | NaN | NaN |
| 320766 | 99410148 | Royaume-Uni | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN |
| 320767 | 9948282780603 | Roumanie | Tomato & ricotta | Panzani | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN |
| 320768 | 99567453 | États-Unis | Mint Melange Tea A Blend Of Peppermint, Lemon Grass And Spearmint | Trader Joe's | 0.0 | 0.0000 | 0.00 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | b | 0.0 |
| 320769 | 9970229501521 | Chine | 乐吧泡菜味薯片 | 乐吧 | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN |
| 320770 | 9980282863788 | France | Tomates aux Vermicelles | Knorr | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN |
| 320771 | 999990026839 | États-Unis | Sugar Free Drink Mix, Peach Tea | Market Pantry | 2092.0 | 0.0000 | 0.00 | NaN | 7.0 | 0.0 | 0.0 | NaN | NaN | NaN | NaN | NaN |
Most frequently occurring
| | code | countries_fr | product_name | brands | energy_100g | salt_100g | sodium_100g | fiber_100g | additives_n | sugars_100g | fat_100g | saturated_fat_100g | nutrition_score_uk_100g | nutrition_score_fr_100g | nutrition_grade_fr | cholesterol_100g | # duplicates |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 1 | NaN | en:fruit-yogurts | France | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | 7 |
| 2 | NaN | en:stirred-yogurts | France | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | 4 |
| 3 | NaN | en:whole-milk-yogurts | France | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | 4 |
| 4 | NaN | en:yogurts | France | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | 3 |
| 0 | NaN | en:dairies | France | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | 2 |
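The table above lists rows that occur more than once in full. One way to produce such a tally, sketched here with the standard library only, is to treat each row as a tuple and count it. With pandas the same idea is usually expressed through `DataFrame.duplicated`. The rows below are abbreviated to three columns and use `None` for missing cells, so they are illustrative rather than an exact copy of the data.

```python
from collections import Counter

# Abbreviated rows: (code, countries_fr, product_name); None marks a missing cell.
rows = [
    (None, "en:fruit-yogurts", "France"),
    (None, "en:fruit-yogurts", "France"),
    (None, "en:yogurts", "France"),
    ("0000000004559", "États-Unis", "Peanuts"),
    (None, "en:yogurts", "France"),
    (None, "en:fruit-yogurts", "France"),
]

tally = Counter(rows)
# Keep only rows seen more than once, most frequent first.
duplicates = [(row, n) for row, n in tally.most_common() if n > 1]
for row, n in duplicates:
    print(n, row)
```

Note that the duplicated rows in the report have no code and carry a category tag such as en:fruit-yogurts in countries_fr, which may indicate shifted columns in the source file rather than genuine repeated products.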